Introduction

In Excel there is a neat Paste Special method named Transpose. The transpose is done by copying your cells and then pasting the data to another place with the transpose option. We can easily do the same with Python and OpenPyXL.

Transposed cells in Excel.

There are currently (To the author's knowledge, December 2019) no transpose functions or methods written in OpenPyXL. I have made a small code sample that transposes rows to columns and columns to rows. Only cell data is transposed with these functions. The functions are divided into three main functions.

The functions

Transpose

The first function is Transpose, which takes an OpenPyXL worksheet and which rows and columns to transpose the values to/from. The function uses two loops, which first traverses the rows and then the columns. We set the worksheet's cell values with ws.cell(row=col,column=row).value = ws.cell(row=row,column=col).value. To set the values, we just invert the column and row values.

Transpose_row_to_col and transpose_col_to_row

I have expanded these functions to accept also the target cell to paste the values in, and also the option to delete the transposed cells values. The functions are identical except for the usage of iter_rows and iter_cols. The target_cell_address is a tuple consisting of the cell's absolute address. If delete_source is True the values in the transposed cells are deleted.

First, an empty list called cell_values is created. After that we use the iter_rows method to iterate through the worksheets rows, one by one. The iter_rows produces cells from the worksheet, by row while the iter_cols returns cells by column. The cells are yielded from a generator.

We append the values of the cells to our cell_ranges list. We then use the fill_cells function to enter the values again to our transposed cells.

I hope that this helps you in transposing and rotating your cell values. As always, if you have any questions or comments please let me know.

Complete code

#!/usr/bin/env python3

"""
Transpose row values to column values and vice versa
"""

import openpyxl

def transpose(ws, min_row, max_row, min_col, max_col):
    for row in range(min_row, max_row+1):
        for col in range(min_col, max_col+1):
            ws.cell(row=col,column=row).value = ws.cell(row=row,column=col).value

def transpose_row_to_col(ws, min_row, max_row, min_col, max_col, target_cell_address=(1,1), delete_source=False):
    cell_values = []
    for row in ws.iter_rows(min_row=min_row, max_row=max_row, min_col=min_col, max_col=max_col):
        for cell in row:
            cell_values.append(cell.value)
            if delete_source:
                cell.value = ""
    fill_cells(ws, target_cell_address[0], target_cell_address[1], cell_values)
        
def transpose_col_to_row(ws, min_row, max_row, min_col, max_col, target_cell_address=(1,1), delete_source=False):
    cell_values = []
    for col in ws.iter_cols(min_row=min_row, max_row=max_row, min_col=min_col, max_col=max_col):
        for cell in col:
            cell_values.append(cell.value)
            if delete_source:
                cell.value = ""
    fill_cells(ws, target_cell_address[0], target_cell_address[1], cell_values)

def fill_cells(ws, start_row, start_column, cell_values):
    row = start_row
    column = start_column
    for value in cell_values:
        ws.cell(row=row,column=column).value = value
        row += 1

if __name__ == "__main__":
    # Open workbook
    file_name = "Transpose_cell_values.xlsx"
    wb = openpyxl.load_workbook(file_name)
    ws1 = wb.worksheets[0]
    transpose(ws1, min_row=1, max_row=1, min_col=1, max_col=5)
    #transpose_col_to_row(ws1, min_row=1, max_row=10, min_col=1, max_col=1, target_cell_address=(1,2))
    wb.save(file_name)

To autofill a column with weekdays in Excel we do not need any shortcuts at all.

Write the day you want to start from in one of your cells. Let’s start on Monday.

Select the fill handle.

Drag the cell down as many days as you like. Excel will automatically fill your days in the cells. You can start with whatever day you like.

I have also made a video that you can find here if you prefer: How to automatically fill a column with days of the week in Excel. The video is also embedded in this post.

Introduction

In this post, we will take a look into how we can generate Extensible Markup Language (XML) files from an Excel file with Python. We will be using the Yattag package to generate our XML file and the OpenPyXL package for reading our Excel data.

Packages

Yattag

Yattag is described in its documentation as following:

Yattag is a Python library for generating HTML or XML in a pythonic way.

That pretty much sums Yatttag up, I find it as a simple, easy to use library that just works. I had been searching for this kind of library in order to more easily generate different XML files.

To install Yattag with pip: pip install yattag

Using Yattag

Adding a tag with Yattag is as easy as using the With keyword:

with tag('h1'):
    text('Hello world!')

Tags are automatically closed. To start using Yattag we need to import Doc from Yattag and create our Doc, tag and text with Doc().tagtext().

from yattag import Doc
doc, tag, text = Doc().tagtext()
with tag('h1'):
    text('Hello world!')
doc.getvalue()

Output:
'<h1>Hello world!</h1>'

OpenPyXL

OpenPyXL is a library for interacting with Excel 2010 files. OpenPyXL can read and write to .xlsx and .xlsm files.

To install OpenPyXL with pip: pip install openpyxl

Using OpenPyXL

To load an existing workbook in OpenPyXl we need to use the load_workbook method. We also need to select the sheet we are reading the data from. In our example, we are using popular baby names in New York City. You can access the dataset from the link at the bottom of this post.

I have created a workbook named NY_baby_names.xlsx with one sheet of data, Sheet1. The worksheet has the following headers: Year of Birth, Gender, Child's First Name, Count, Rank. You can download the Excel file from my website here.

To access the data with OpenPyXL, do the following:

from openpyxl import load_workbook
wb = load_workbook("NY_baby_names.xlsx")
ws = wb.worksheets[0]
for row in ws.iter_rows(min_row=1, max_row=2, min_col=1, max_col=4):
    print([cell.value for cell in row])

Output: 
['Year of Birth', 'Gender', "Child's First Name", 'Count']
[2011, 'FEMALE', 'GERALDINE', 13]

First, we load the workbook with load_workbook, and then select the first worksheet. We then iterate through the first two rows with the iter_rows method.

Generating the XML from Excel

After the imports, we load the workbook and the worksheet. We then create our Yattag document. We fill the headers with Yattags asis() method. The asis method enables us to input any string as the next line.

We then create our main tag, Babies. We start looping through our sheet with the iter_rows method. The iter_rows method returns a generator with all the cells. We use a list comprehension to get all the values from the cells.

Next, we are adding the babies. Notice the use of the With tag and text. When we are finished we indent our result with Yattags indent method.

Finally, we save our file. The output should look like below. Notice that I only included two babies in the output.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"></xs:schema>
<Babies>
    <Baby>
        <Name>
            GERALDINE
        </Name>
        <Gender>
            FEMALE
        </Gender>
        <year>
            2011
        </year>
        <count>
            13
        </count>
        <rank>
            75
        </rank>
    </Baby>
    <Baby>
        <Name>
            GIA
        </Name>
        <Gender>
            FEMALE
        </Gender>
        <year>
            2011
        </year>
        <count>
            21
        </count>
        <rank>
            67
        </rank>
    </Baby>
</Babies>

Complete code

from openpyxl import load_workbook
from yattag import Doc, indent

wb = load_workbook("NY_baby_names.xlsx")
ws = wb.worksheets[0]

# Create Yattag doc, tag and text objects
doc, tag, text = Doc().tagtext()

xml_header = '<?xml version="1.0" encoding="UTF-8"?>'
xml_schema = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"></xs:schema>'

doc.asis(xml_header)
doc.asis(xml_schema)

with tag('Babies'):
    # Use ws.max_row for all rows
    for row in ws.iter_rows(min_row=2, max_row=100, min_col=1, max_col=5):
        row = [cell.value for cell in row]
        with tag("Baby"):
            with tag("Name"):
                text(row[2])
            with tag("Gender"):
                text(row[1])
            with tag("year"):
                text(row[0])
            with tag("count"):
                text(row[3])
            with tag("rank"):
                text(row[4])

result = indent(
    doc.getvalue(),
    indentation = '    ',
    indent_text = True
)

with open("baby_names.xml", "w") as f:
    f.write(result)

Data sets

Popular Baby Names dataset: https://catalog.data.gov/dataset/most-popular-baby-names-by-sex-and-mothers-ethnic-group-new-york-city-8c742

I got an excellent question in my OpenPyXL course. The question was how to transfer data from one Excel workbook to another with OpenPyXL. Transferring data only is easier, as there is no need to worry about the formatting of cells and sheets. The complete code is available at the bottom of this post.

Mission statement

The question was how to copy data from one sheet in a workbook to several other sheets in a newly created workbook. The source workbook is WB1 with the source sheet WS1. The source sheet has 1000 rows of data. The data shall be copied to the new workbook WB2 and the sheets WS1 to WS10.

Imports and loading workbook

To copy the values we need to import Workbook and load_workbook from the OpenPyXL library. We can now load our existing workbook, WB1. WB1 = load_workbook("Source.xlsx", data_only=True). The next thing we need to do is set which sheet we are going to copy the data from. We name the sheet WB1_WS1 WB1_WS1 = WB1["WS1"] . After that we are ready to create a new workbook with WB2 = Workbook(). Notice the brackets for the method.

Creating sheets

The question stated that 10 sheets should be created, with the names WS1 to WS10. We can create the sheets with a for loop. Each sheet is created with create_sheet(f"WS{i}"). Notice the usage of Pythons f-string. We then remove the default created sheet, Sheet1. We could also of course have renamed it.

# Create WB2 sheets WS1-WS10
for i in range(1, 11):
    WB2.create_sheet(f"WS{i}")

# delete first sheet
WB2.remove(WB2.worksheets[0])

Copy preparations

The next thing is to create a list for holding our copy ranges, and also which sheets we want to copy the data to. The copy_ranges list holds how many rows we need to copy from the source sheet to the sheets defined in copy_to_sheets.

# Define the copy ranges and sheets
copy_ranges = [100, 200, 50, 300, 350]
copy_to_sheets = ["WS1", "WS2", "WS3", "WS4", "WS4"]

Copying the data

We start with a for loop, that iterates through the copy_ranges list. for i in range( len(copy_ranges)): We then specify which sheet is in turn for copying ws = WB2[ copy_to_sheets[i] ]. Notice how we specify the sheet with the copy_to_sheets list. When i is 1 we select WS1 and so on. We also initialize our row_offset to 1 so that we can keep track of which rows to copy next. We then set the row_offset with yet another for loop. We increase the offset i times with the corresponding values from copy_ranges.

for s in range(i):
    offset += copy_ranges[s]

Now it is time to fill our sheets with data! we traverse through our offset range with a for loop and set the values of the corresponding sheet. First we get the row with for j in range(offset, offset + copy_ranges[i]):. Next up are the cells in each row:

for row in WB1_WS1.iter_rows(min_row=j, max_row=j, min_col=1, max_col=WB1_WS1.max_column):.

We get the values for values_row with a list comprehension [cell.value for cell in row]. Finally, we append the row to the sheet with ws.append(values_row).

# Copy the row with the help of iter_rows, append the row
for j in range(offset,  offset + copy_ranges[i]):
    #if j == 0:
    #    continue
    for row in WB1_WS1.iter_rows(min_row=j, max_row=j, min_col=1, max_col=WB1_WS1.max_column):
        values_row = [cell.value for cell in row]
    ws.append(values_row)

To wrap up, we save the workbook: WB2.save("WB2.xlsx")

That's it! Please comment below if you have questions or any feedback. See you later!

Use this link, Control Excel with Python & OpenPyXL or this code SAVEOPENPYXL to get the course for a discount on Udemy.com

This image has an empty alt attribute; its file name is OpenPyXL_course-1024x407.png

The code

#!/usr/bin/python
# -*- coding: utf-8 -*-

"""
Could you please suggest  how to copy the data from on work book to other book with specified rows
Source: Excel work book "WB1" having work sheet "WS1", This sheet  having 1000 rows of data
Destination: New work book 'WB2' and  work sheets WS1,WS2...WS10
Could you please suggest the code for following condition:
Copy the first 100 rows data and paste it WS1 sheet
Copy the next 200 rows data and paste it WS2 sheet
Copy the next 50 rows data and paste it WS3 sheet
Copy the next 300 rows data and paste it WS4 sheet
Copy the next 350 rows data and paste it WS4 sheet
"""

from openpyxl import Workbook, load_workbook

WB1 = load_workbook("Source.xlsx", data_only=True)
WB1_WS1 = WB1["WS1"]
WB2 = Workbook()

# Create WB2 sheets WS1-WS10
for i in range(1, 11):
    WB2.create_sheet(f"WS{i}")

# delete first sheet
WB2.remove(WB2.worksheets[0])

# Define the copy ranges and sheets
copy_ranges = [100, 200, 50, 300, 350]
copy_to_sheets = ["WS1", "WS2", "WS3", "WS4", "WS4"]

# Copy the values from the rows in WB1 to WB2.
for i in range( len(copy_ranges)):
    # Set the sheet to copy to
    ws = WB2[ copy_to_sheets[i] ]
    # Initialize row offset
    offset = 1
    # Set the row offset
    for s in range(i):
        offset += copy_ranges[s]

    # Copy the row with the help of iter_rows, append the row
    for j in range(offset,  offset + copy_ranges[i]):
        #if j == 0:
        #    continue
        for row in WB1_WS1.iter_rows(min_row=j, max_row=j, min_col=1, max_col=WB1_WS1.max_column):
            values_row = [cell.value for cell in row]
        ws.append(values_row)

# Save the workbook
WB2.save("WB2.xlsx")

Saving an Excel sheet to Pdf with Python

Saving a finished report or table in Excel is easy. You choose SaveAs and save the sheet as Pdf. Doing this automatically with Python is a bit trickier though. In this post, we will take a closer look on how to do this with the win32 library. The full code is available at the bottom of the post. Note that you need Excel installed in order to run this script successfully.

Installing dependencies

install the win32 library first with: pip install pypiwin32. This will install the Win32 Api library, which according to PyPi contains: Python extensions for Microsoft Windows Provides access to much of the Win32 API, the ability to create and use COM objects, and the Pythonwin environment.

File paths

To get the file paths we use pathlib. Pathlib was introduced in Python 3.4 so it is quite new (Using Python 3.8 during the writing of this article). We specify the name of the Excel workbook we want to make a pdf of, and also the output pdf's name.

excel_file = "pdf_me.xlsx"
pdf_file = "pdf_me.pdf"

We then create paths from our current working directory (cwd) with Pathlibs cwd() method.

excel_path = str(pathlib.Path.cwd() / excel_file)
pdf_path = str(pathlib.Path.cwd() / pdf_file)

Firing up Excel

Excel is next up. We start the Excel application and hide it.

excel = client.DispatchEx("Excel.Application")
excel.Visible = 0

We then open our workbook  wb = excel.Workbooks.Open(excel_path) and load our first sheet with ws = wb.Worksheets[1]

Now it is time to use the SaveAs to save our sheet as a pdf. wb.SaveAs(pdf_path, FileFormat=57)Fileformat 57 is the pdf file format.

We then close our workbook and quit our Excel application. Our pdf is now saved in our working directory.

The code

from win32com import client
import win32api
import pathlib

### pip install pypiwin32 if module not found

excel_file = "pdf_me.xlsx"
pdf_file = "pdf_me.pdf"
excel_path = str(pathlib.Path.cwd() / excel_file)
pdf_path = str(pathlib.Path.cwd() / pdf_file)

excel = client.DispatchEx("Excel.Application")
excel.Visible = 0

wb = excel.Workbooks.Open(excel_path)
ws = wb.Worksheets[1]

try:
    wb.SaveAs(pdf_path, FileFormat=57)
except Exception as e:
    print("Failed to convert")
    print(str(e))
finally:
    wb.Close()
    excel.Quit()

 

I have finally published my first course, Control Excel with Python & OpenPyXL!

The course covers the ins and outs of how to control and automate Excel with the OpenPyXL library.

I am offering this course for a discount for you who are reading this post.

Use this link, Control Excel with Python & OpenPyXL or this code SAVEOPENPYXL to get the course for a discount on Udemy.com

Here is a short video on how to map XML schemas to Excel.

Creating and formatting a chart automatically in Excel with VBA

Using Excel involves creating a lot of charts and tables. Most of the time, the operations are really simple but time consuming. Creating and formatting a chart takes time. In this post, we are exploring how we can create a chart automatically with some VBA code. (more…)

Calling a Sub from a Sub in VBA - When you are creating macros you sometimes want to run other functions or subs from your VBA code. One clarification about what a subroutine is: It is a function that does not return anything, or in other words a macro. In this post, we are going to cover how to call a subroutine from another.

In the last post about the Worksheet.Change event, we covered how to catch a change event. We showed a MsgBox and did some changing of values. We could easily have called a subroutine as well. Let's call one from a button.

Sub Our_sub_button()
    I_like_to_be_called
End Sub

Sub I_like_to_be_called()
    MsgBox "Hi, nice to be called!"
End Sub

First, we attach the "Our_sub_button" to a Form button. We then call the "I_like_to_be_called" subroutine from our button. Note that you do not need the Call keyword, but you can use it if you like as below:

Sub Our_sub_button()
    Call I_like_to_be_called
End Sub

Sub I_like_to_be_called()
    MsgBox "Hi, nice to be called!"
End Sub

This is all good Conny, but I want to pass variables also. How do I do that? See below. You just put the variables after the call, omitting the parentheses since we are not using the Call keyword. You also need to create the variables in your subroutine that you are calling.

Sub Our_sub_button()
    I_like_to_be_called "this is ", "my message"
End Sub

Sub I_like_to_be_called(MSG As String, MSG2 As String)
    MsgBox (MSG + MSG2)
End Sub

Now you know how to call a subroutine. Happy coding!

Excel, VBA, and the Worksheet.Change Event - Do you have users that change a worksheet, and at each change, you are required to check if some specific action is required? Then you can use the Worksheet.Change event provided in VBA. You can find more information about the Worksheet change event at Microsoft's Dev Center. In this article, we will code a short Worksheet.Change script. (more…)

linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram