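When working with data in a Pandas DataFrame in Python, we often encounter time series data. Pandas is a powerful tool for handling such data, but the dates in a dataset are frequently stored as strings and must first be converted to the datetime type.

In this tutorial, we will learn how to convert a DataFrame column from strings, in formats such as "mm/dd/yyyy", to the datetime type. While dates are stored as strings, time-series operations on them are either unavailable or give misleading results, so we convert the column first.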
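To illustrate why the conversion matters, here is a minimal sketch comparing how the same values sort as text and as dates. The sample values and variable names are illustrative assumptions, not part of a real dataset.

```python
import pandas as pnd

# Hypothetical sample dates stored as "mm/dd/yyyy" strings
dates = pnd.Series(['12/05/2021', '11/21/2018', '01/12/2020'])

# Sorting the raw strings compares characters, not calendar dates
sorted_as_text = dates.sort_values().tolist()

# After conversion, sorting follows the calendar
parsed = pnd.to_datetime(dates, format='%m/%d/%Y')
sorted_as_dates = parsed.sort_values().dt.strftime('%m/%d/%Y').tolist()

print(sorted_as_text)
print(sorted_as_dates)
```

The text sort puts the 2020 date first because "01" sorts before "11", while the datetime sort correctly starts with the 2018 date.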

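Different methods for converting a column from string to datetime:

In this section, we discuss the methods that can be used to change the data type of a Pandas DataFrame column from string to datetime.

Method 1: Using the pandas.to_datetime() function

In this method, we use the "pandas.to_datetime()" function to convert the data type of a Pandas DataFrame column to datetime.

Example: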

import pandas as pnd  
   
# Creating the dataframe  
data_frame = pnd.DataFrame({'Date':['12/05/2021', '11/21/2018', '01/12/2020'],  
                'Event':['Music- Dance', 'Poetry- Songs', 'Theatre- Drama'],  
                'Cost':[15400, 7000, 25000]})  
   
# Print the dataframe  
print ("The data is: ")   
print (data_frame)  
   
# Here, we are checking the data type of the 'Date' column  
data_frame.info() 

Output:

The data is: 
         Date           Event   Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Date    3 non-null      object
 1   Event   3 non-null      object
 2   Cost    3 non-null      int64 
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes

In the output, we can see that the "Date" column of the DataFrame has the data type "object", which means it holds strings. Now, we will use the "pnd.to_datetime()" function to convert it to datetime:

import pandas as pnd  
   
# Creating the dataframe  
data_frame = pnd.DataFrame({'Date':['12/05/2021', '11/21/2018', '01/12/2020'],  
                'Event':['Music- Dance', 'Poetry- Songs', 'Theatre- Drama'],  
                'Cost':[15400, 7000, 25000]})  
   
# Print the dataframe  
print ("The data is: ")   
print (data_frame)  
  
# For converting the 'Date' column of DataFrame into datetime format  
data_frame['Date'] = pnd.to_datetime(data_frame['Date'])  
  
# Here, we are checking the data type of the 'Date' column  
data_frame.info()  

Output:

The data is: 
         Date           Event   Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   Date    3 non-null      datetime64[ns]
 1   Event   3 non-null      object        
 2   Cost    3 non-null      int64         
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 200.0+ bytes

Now we can see that the data type of the "Date" column in the DataFrame has changed to datetime.
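A value such as '01/12/2020' can be read as either January 12 or December 1, so it may be safer to state the layout explicitly instead of relying on inference. The following is a small sketch of that idea using the "format" parameter of "to_datetime()"; the single-value Series is a made-up example.

```python
import pandas as pnd

# Hypothetical ambiguous value: could be month/day or day/month
ambiguous = pnd.Series(['01/12/2020'])

# Read as month/day/year: January 12, 2020
month_first = pnd.to_datetime(ambiguous, format='%m/%d/%Y')

# Read as day/month/year: December 1, 2020
day_first = pnd.to_datetime(ambiguous, format='%d/%m/%Y')

print(month_first.iloc[0])
print(day_first.iloc[0])
```

Passing the format that matches the source data keeps the result independent of how the parser would otherwise guess the order of day and month.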

Method 2: Using the DataFrame.astype() function

In this method, we use the "DataFrame.astype()" function to convert the data type of a Pandas DataFrame column to datetime.

Example:

import pandas as pnd  
   
# Creating the dataframe  
data_frame = pnd.DataFrame({'Date':['12/05/2021', '11/21/2018', '01/12/2020'],  
                'Event':['Music- Dance', 'Poetry- Songs', 'Theatre- Drama'],  
                'Cost':[15400, 7000, 25000]})  
   
# Print the dataframe  
print ("The data is: ")   
print (data_frame)  
   
# Here, we are checking the data type of the 'Date' column  
data_frame.info()  

Output:

The data is: 
         Date           Event   Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Date    3 non-null      object
 1   Event   3 non-null      object
 2   Cost    3 non-null      int64 
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes

In the output, we can see that the "Date" column of the DataFrame has the data type "object", which means it holds strings. Now, we will use the "DataFrame.astype()" function to convert it to datetime:

import pandas as pnd  
   
# Creating the dataframe  
data_frame = pnd.DataFrame({'Date':['12/05/2021', '11/21/2018', '01/12/2020'],  
                'Event':['Music- Dance', 'Poetry- Songs', 'Theatre- Drama'],  
                'Cost':[15400, 7000, 25000]})  
   
# Print the dataframe  
print ("The data is: ")   
print (data_frame)  
# For converting the 'Date' column of DataFrame into datetime format  
data_frame['Date'] = data_frame['Date'].astype('datetime64[ns]')  
   
# Here, we are checking the data type of the 'Date' column  
data_frame.info()

Output:

The data is: 
         Date           Event   Cost
0  12/05/2021    Music- Dance  15400
1  11/21/2018   Poetry- Songs   7000
2  01/12/2020  Theatre- Drama  25000
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   Date    3 non-null      datetime64[ns]
 1   Event   3 non-null      object        
 2   Cost    3 non-null      int64         
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 200.0+ bytes

Now we can see that the data type of the "Date" column in the DataFrame has changed to datetime by using data_frame['Date'].astype('datetime64[ns]').
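Real datasets often contain entries that are not valid dates. As a hedged sketch of one way to handle them, "to_datetime()" accepts errors='coerce', which turns unparseable entries into NaT (the missing-value marker for datetimes) instead of raising an error. The sample values below are assumptions made for illustration.

```python
import pandas as pnd

# Hypothetical column with one entry that is not a date
raw = pnd.Series(['12/05/2021', 'not a date', '01/12/2020'])

# Unparseable entries become NaT instead of stopping the conversion
cleaned = pnd.to_datetime(raw, format='%m/%d/%Y', errors='coerce')

# Count how many entries could not be converted
n_bad = int(cleaned.isna().sum())

print(cleaned)
print("Entries that could not be parsed:", n_bad)
```

Counting the NaT values afterwards shows how many rows need attention before further analysis.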

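Method 3: Using pandas.to_datetime() with the format parameter

Suppose the dates in our DataFrame column are in "yymmdd" format, and we need to convert them from string to datetime.

Example: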

import pandas as pnd  
   
# Now, we will initialize the nested list with Dataset  
play_list = [['210302', 67000], ['210901', 62000], ['210706', 61900],  
            ['210402', 59000], ['210802', 74000],   
            ['210804', 54050], ['210109', 57650], ['210509', 67300], ['210209', 76600]]  
   
# Creating a pandas DataFrame  
data_frame = pnd.DataFrame(play_list,columns = ['Date','Patient Number'])  
   
# Print the dataframe  
print ("The data is: ")   
print (data_frame)  
   
# Here, we are checking the data type of the 'Date' column  
print (data_frame.dtypes)  

Output:

The data is: 
     Date  Patient Number
0  210302           67000
1  210901           62000
2  210706           61900
3  210402           59000
4  210802           74000
5  210804           54050
6  210109           57650
7  210509           67300
8  210209           76600
Date              object
Patient Number     int64
dtype: object

In the output, we can see that the "Date" column of the DataFrame has the data type "object", which means it holds strings. Now, we will use "pnd.to_datetime(data_frame['Date'], format = '%y%m%d')" to convert it to datetime.

import pandas as pnd  
   
# Now, we will initialize the nested list with Dataset  
play_list = [['210302', 67000], ['210901', 62000], ['210706', 61900],   
            ['210402', 59000], ['210802', 74000],   
            ['210804', 54050], ['210109', 57650], ['210509', 67300], ['210209', 76600]]  
   
# creating a pandas dataframe  
data_frame = pnd.DataFrame(play_list,columns = ['Date','Patient Number'])  
   
# Print the dataframe  
print ("The data is: ")   
print (data_frame)  
  
# For converting the 'Date' column of DataFrame into datetime format  
data_frame['Date'] = pnd.to_datetime(data_frame['Date'], format = '%y%m%d')  
   
# Here, we are checking the data type of the 'Date' column  
print (data_frame.dtypes)  

Output:

The data is: 
     Date  Patient Number
0  210302           67000
1  210901           62000
2  210706           61900
3  210402           59000
4  210802           74000
5  210804           54050
6  210109           57650
7  210509           67300
8  210209           76600
Date              datetime64[ns]
Patient Number             int64
dtype: object

In the above code, we used "pnd.to_datetime(data_frame['Date'], format = '%y%m%d')" to change the data type of the "Date" column from "object" to "datetime64[ns]".
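Once a column has the datetime type, its parts can be read through the ".dt" accessor, and the dates can be rendered back to text in any layout with "strftime". The sketch below uses a shortened version of the data above; the "Month" and "Display" column names are hypothetical additions for illustration.

```python
import pandas as pnd

# A shortened version of the dataset, with dates in "yymmdd" format
data_frame = pnd.DataFrame({'Date': ['210302', '210901', '210706'],
                            'Patient Number': [67000, 62000, 61900]})
data_frame['Date'] = pnd.to_datetime(data_frame['Date'], format='%y%m%d')

# Extract the month number from each date
data_frame['Month'] = data_frame['Date'].dt.month

# Render the dates as "dd/mm/yy" text for display.
# Note: this column holds strings again, not datetimes.
data_frame['Display'] = data_frame['Date'].dt.strftime('%d/%m/%y')

print(data_frame)
```

Keeping the datetime column for calculations and creating a separate text column only for display avoids converting back and forth.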

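Method 4: Converting multiple columns with pandas.to_datetime()

We can use the "pandas.to_datetime()" function to convert multiple columns that hold strings in "YYYYMMDD" format to datetime.

Example: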

import pandas as pnd  
   
# Initializing the nested list with Data set  
Dataset_list = [['20210612', 54000, '20210812'],   
               ['20210814', 65000, '20210614'],   
               ['20210316', 71500, '20210316'],   
               ['20210519', 45000, '20210119'],   
               ['20210221', 98000, '20210221'],   
               ['20210124', 23000, '20210724'],   
               ['20210929', 12000, '20210924']]   
   
# creating a pandas dataframe  
data_frame = pnd.DataFrame(  
  Dataset_list, columns = ['Treatment_starting_Date',  
                         'Patients Number',  
                         'Treatment_ending_Date'])  
   
# Print the dataframe  
print ("The data is: ")   
print (data_frame)  
   
# Here, we are checking the data types of all the columns  
print (data_frame.dtypes)  

Output:

The data is: 
  Treatment_starting_Date  Patients Number Treatment_ending_Date
0                20210612            54000              20210812
1                20210814            65000              20210614
2                20210316            71500              20210316
3                20210519            45000              20210119
4                20210221            98000              20210221
5                20210124            23000              20210724
6                20210929            12000              20210924
Treatment_starting_Date    object
Patients Number             int64
Treatment_ending_Date      object
dtype: object

In the above output, we can see that the "Treatment_starting_Date" and "Treatment_ending_Date" columns have the data type "object", which means they hold strings. Now, we will call "pnd.to_datetime()" with format = '%Y%m%d' on each of these columns to convert them to datetime.

import pandas as pnd  
   
# Initializing the nested list with Data set  
Dataset_list = [['20210612', 54000, '20210812'],  
               ['20210814', 65000, '20210614'],  
               ['20210316', 71500, '20210316'],  
               ['20210519', 45000, '20210119'],  
               ['20210221', 98000, '20210221'],  
               ['20210124', 23000, '20210724'],  
               ['20210929', 12000, '20210924']]  
   
# creating a pandas dataframe  
data_frame = pnd.DataFrame(  
  Dataset_list, columns = ['Treatment_starting_Date',  
                         'Patients Number',  
                         'Treatment_ending_Date'])  
   
# Print the dataframe  
print ("The data is: ")   
print (data_frame)  
  
  
# For converting the multiple columns of DataFrame into datetime format  
data_frame['Treatment_starting_Date'] = pnd.to_datetime(  
                          data_frame['Treatment_starting_Date'],  
                          format = '%Y%m%d'  
)  
data_frame['Treatment_ending_Date'] = pnd.to_datetime(  
                          data_frame['Treatment_ending_Date'],  
                          format = '%Y%m%d'  
)  
   
# Here, we are checking the data types of all the columns  
print (data_frame.dtypes)  

Output:

The data is: 
  Treatment_starting_Date  Patients Number Treatment_ending_Date
0                20210612            54000              20210812
1                20210814            65000              20210614
2                20210316            71500              20210316
3                20210519            45000              20210119
4                20210221            98000              20210221
5                20210124            23000              20210724
6                20210929            12000              20210924
Treatment_starting_Date    datetime64[ns]
Patients Number                     int64
Treatment_ending_Date      datetime64[ns]
dtype: object

In the above output, we can see that by using the "pnd.to_datetime()" function, the data types of the "Treatment_starting_Date" and "Treatment_ending_Date" columns have changed to datetime.
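When several columns share the same layout, one possible shortcut is to pass "pnd.to_datetime" to "DataFrame.apply()" so that all of them are converted in a single step. This is a sketch on a shortened dataset; the "date_columns" list and the "Duration" column are hypothetical names introduced for illustration.

```python
import pandas as pnd

# A shortened version of the dataset, with dates in "YYYYMMDD" format
data_frame = pnd.DataFrame({
    'Treatment_starting_Date': ['20210612', '20210316'],
    'Patients Number': [54000, 71500],
    'Treatment_ending_Date': ['20210812', '20210316'],
})

# Convert every listed column with the same format in one step
date_columns = ['Treatment_starting_Date', 'Treatment_ending_Date']
data_frame[date_columns] = data_frame[date_columns].apply(
    pnd.to_datetime, format='%Y%m%d'
)

# Datetime columns can be subtracted to get the elapsed time
data_frame['Duration'] = (
    data_frame['Treatment_ending_Date'] - data_frame['Treatment_starting_Date']
)

print(data_frame.dtypes)
print(data_frame['Duration'].dt.days.tolist())
```

Subtracting the two converted columns gives the length of each treatment, which is not possible while the columns hold strings.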

Conclusion

In this tutorial, we learned different methods for converting the type of a Pandas DataFrame column from string to datetime in Python.
