pyspark.pandas.DataFrame.to_excel#
- DataFrame.to_excel(excel_writer, sheet_name='Sheet1', na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, startrow=0, startcol=0, engine=None, merge_cells=True, inf_rep='inf', freeze_panes=None)#
Write object to an Excel sheet.
Note
This method should only be used if the resulting DataFrame is expected to be small, as all the data is loaded into the driver’s memory.
To write a single object to an Excel .xlsx file it is only necessary to specify a target file name. To write to multiple sheets it is necessary to create an ExcelWriter object with a target file name, and specify a sheet in the file to write to.
Multiple sheets may be written to by specifying unique sheet_name. With all data written to the file it is necessary to save the changes. Note that creating an ExcelWriter object with a file name that already exists will result in the contents of the existing file being erased.
- Parameters
- excel_writer: str or ExcelWriter object
File path or existing ExcelWriter.
- sheet_name: str, default ‘Sheet1’
Name of sheet which will contain DataFrame.
- na_rep: str, default ‘’
Missing data representation.
- float_format: str, optional
Format string for floating point numbers. For example
float_format="%%.2f"
will format 0.1234 to 0.12.- columns: sequence or list of str, optional
Columns to write.
- header: bool or list of str, default True
Write out the column names. If a list of string is given it is assumed to be aliases for the column names.
- index: bool, default True
Write row names (index).
- index_label: str or sequence, optional
Column label for index column(s) if desired. If not specified, and header and index are True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex.
- startrow: int, default 0
Upper left cell row to dump data frame.
- startcol: int, default 0
Upper left cell column to dump data frame.
- engine: str, optional
Write engine to use, ‘openpyxl’ or ‘xlsxwriter’. You can also set this via the options
io.excel.xlsx.writer
,io.excel.xls.writer
, andio.excel.xlsm.writer
.- merge_cells: bool, default True
Write MultiIndex and Hierarchical Rows as merged cells.
- inf_rep: str, default ‘inf’
Representation for infinity (there is no native representation for infinity in Excel).
- freeze_panes: tuple of int (length 2), optional
Specifies the one-based bottommost row and rightmost column that is to be frozen.
See also
read_excel
Read Excel file.
Notes
Once a workbook has been saved it is not possible write further data without rewriting the whole workbook.
Examples
Create, write to, and save a workbook:
>>> df1 = ps.DataFrame([['a', 'b'], ['c', 'd']], ... index=['row 1', 'row 2'], ... columns=['col 1', 'col 2']) >>> df1.to_excel("output.xlsx")
To specify the sheet name:
>>> df1.to_excel("output.xlsx") >>> df1.to_excel("output.xlsx", ... sheet_name='Sheet_name_1')
If you wish to write to more than one sheet in the workbook, it is necessary to specify an ExcelWriter object:
>>> with pd.ExcelWriter('output.xlsx') as writer: ... df1.to_excel(writer, sheet_name='Sheet_name_1') ... df2.to_excel(writer, sheet_name='Sheet_name_2')
To set the library that is used to write the Excel file, you can pass the engine keyword (the default engine is automatically chosen depending on the file extension):
>>> df1.to_excel('output1.xlsx', engine='xlsxwriter')