Converting a Timestamp to a datetime
When working with time series in pandas, two functions come up constantly:
- pd.to_datetime()
- pd.date_range()
Generating a time index with pandas
    import pandas as pd

    # pd.date_range()
    index = pd.date_range("20210101", periods=20)
    index
    Out[29]:
    DatetimeIndex(['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04',
                   '2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08',
                   '2021-01-09', '2021-01-10', '2021-01-11', '2021-01-12',
                   '2021-01-13', '2021-01-14', '2021-01-15', '2021-01-16',
                   '2021-01-17', '2021-01-18', '2021-01-19', '2021-01-20'],
                  dtype='datetime64[ns]', freq='D')

    # pd.to_datetime()
    df = pd.DataFrame(data=range(20210101, 20210128), columns=["period"])
    df["aa"] = pd.to_datetime(df["period"], format="%Y%m%d")
    df
    Out[24]:
          period         aa
    0   20210101 2021-01-01
    1   20210102 2021-01-02
    2   20210103 2021-01-03
    3   20210104 2021-01-04
    4   20210105 2021-01-05
    5   20210106 2021-01-06
    6   20210107 2021-01-07
    7   20210108 2021-01-08
    8   20210109 2021-01-09
    9   20210110 2021-01-10
    10  20210111 2021-01-11
    11  20210112 2021-01-12
    12  20210113 2021-01-13
    13  20210114 2021-01-14
    14  20210115 2021-01-15
    15  20210116 2021-01-16
    16  20210117 2021-01-17
    17  20210118 2021-01-18
    18  20210119 2021-01-19
    19  20210120 2021-01-20
    20  20210121 2021-01-21
    21  20210122 2021-01-22
    22  20210123 2021-01-23
    23  20210124 2021-01-24
    24  20210125 2021-01-25
    25  20210126 2021-01-26
    26  20210127 2021-01-27

    # An element taken from either object is a pandas Timestamp
    index[1]
    Out[30]: Timestamp('2021-01-02 00:00:00', freq='D')

    df["aa"][1]
    Out[31]: Timestamp('2021-01-02 00:00:00')

    df["aa"][1] == index[1]
    Out[32]: True

    type(df["aa"][1])
    Out[33]: pandas._libs.tslibs.timestamps.Timestamp

    type(index[1])
    Out[34]: pandas._libs.tslibs.timestamps.Timestamp
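To make the division of labour between the two functions concrete, here is a minimal sketch: pd.date_range() generates a regular sequence, while pd.to_datetime() parses values you already have. The variable names (`raw`, `parsed`, `messy`) are illustrative and not from the original session, and the `errors="coerce"` part is an extra that the original does not cover.

```python
import pandas as pd

# Generate a regular daily sequence: three consecutive days.
index = pd.date_range("2021-01-01", periods=3)

# Parse existing integer-coded dates (YYYYMMDD) into timestamps.
raw = pd.Series([20210101, 20210102, 20210103])
parsed = pd.to_datetime(raw, format="%Y%m%d")

# Both routes produce equal Timestamp values.
same_values = parsed.tolist() == index.tolist()

# With errors="coerce", values that cannot be parsed become NaT
# instead of raising an exception.
messy = pd.Series(["2021-01-01", "not a date"])
coerced = pd.to_datetime(messy, errors="coerce")
is_missing = coerced.isna().tolist()
```

Whether to coerce or to let the parse fail depends on how much you trust the input; coercing silently hides bad rows, so it is worth counting the NaT values afterwards.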
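Timestamp vs. datetime
As the code above shows, pandas represents a point in time as pandas._libs.tslibs.timestamps.Timestamp (exposed publicly as pd.Timestamp),
whereas the type most commonly used in plain Python is datetime.datetime. To convert the former into the latter, use:
- to_pydatetime()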
    from datetime import datetime

    t = datetime(2021, 1, 2)
    type(t)
    Out[54]: datetime.datetime

    t
    Out[55]: datetime.datetime(2021, 1, 2, 0, 0)

    # index is the DatetimeIndex created earlier with pd.date_range()
    r = index[1].to_pydatetime()
    type(r)
    Out[57]: datetime.datetime

    t == r
    Out[58]: True
Converting a pandas Timestamp to a datetime
    In [11]: ts = pd.Timestamp('2014-01-23 00:00:00', tz=None)

    In [12]: ts.to_pydatetime()
    Out[12]: datetime.datetime(2014, 1, 23, 0, 0)
    # It's also available on a DatetimeIndex
    rng = pd.date_range('1/10/2011', periods=3, freq='D')
    rng.to_pydatetime()
    Out[60]:
    array([datetime.datetime(2011, 1, 10, 0, 0),
           datetime.datetime(2011, 1, 11, 0, 0),
           datetime.datetime(2011, 1, 12, 0, 0)], dtype=object)
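A few details of the conversion are easy to miss, so here is a small sketch of them. It relies on three facts about pandas: pd.Timestamp is itself a subclass of datetime.datetime, to_pydatetime() keeps time zone information, and nanoseconds are dropped because datetime.datetime only stores microseconds. The sample values are made up for illustration.

```python
from datetime import datetime, timedelta

import pandas as pd

ts = pd.Timestamp("2021-01-02 03:04:05.123456789")

# Timestamp subclasses datetime.datetime, so isinstance checks already pass
# without any conversion.
is_datetime_subclass = isinstance(ts, datetime)

# to_pydatetime() returns a plain datetime.datetime. The trailing 789 ns
# cannot be represented and are discarded; warn=False silences the warning
# pandas would otherwise emit about that.
py = ts.to_pydatetime(warn=False)

# A tz-aware Timestamp converts to a tz-aware datetime.
ts_utc = pd.Timestamp("2021-01-02 03:04:05", tz="UTC")
py_utc = ts_utc.to_pydatetime()
```

In practice this means an explicit conversion is mainly needed when a library checks for the exact type datetime.datetime, or when serialising to a format that does not know about pandas types.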
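Extracting hours, minutes, etc. from a Timestamp in pandas
Official documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#from-timestamps-to-epoch
I recently needed the number of minutes between a given time of day and 0:00. After reading the documentation, I came up with the following approach.
Suppose the data is: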
    In [64]: stamps = pd.date_range('2012-10-08 18:15:05', periods=4, freq='h')

    In [65]: stamps
    Out[65]:
    DatetimeIndex(['2012-10-08 18:15:05', '2012-10-08 19:15:05',
                   '2012-10-08 20:15:05', '2012-10-08 21:15:05'],
                  dtype='datetime64[ns]', freq='H')
First, compute the number of seconds since 1970-01-01:
    In [66]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s')
    Out[66]: Int64Index([1349720105, 1349723705, 1349727305, 1349730905], dtype='int64')
Take the remainder modulo one day (86400 seconds) to get the number of seconds since 0:00:
    In [67]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400
    Out[67]: Int64Index([65705, 69305, 72905, 76505], dtype='int64')
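The arithmetic behind the modulo step can be checked by hand on a single value. The sketch below uses plain Python and takes the first epoch value from Out[66]; the variable names are illustrative. It assumes tz-naive timestamps, as in the session; for tz-aware data the epoch is counted in UTC, so the remainder would be the UTC time of day.

```python
SECONDS_PER_DAY = 86400

# First value from Out[66], corresponding to 2012-10-08 18:15:05.
epoch_seconds = 1349720105

# Split into whole days since 1970-01-01 and seconds elapsed since 0:00.
days, seconds_of_day = divmod(epoch_seconds, SECONDS_PER_DAY)

# Break the seconds of the day down into hours, minutes and seconds.
hours, rest = divmod(seconds_of_day, 3600)
minutes, seconds = divmod(rest, 60)

print(days, seconds_of_day, hours, minutes, seconds)
# prints: 15621 65705 18 15 5
```

The remainder 65705 equals 18 * 3600 + 15 * 60 + 5, which matches the 18:15:05 time of day of the first timestamp.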
Divide by 60 to get the number of minutes since 0:00:
    In [68]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 / 60
    Out[68]:
    Float64Index([1095.0833333333333, 1155.0833333333333, 1215.0833333333333,
                  1275.0833333333333],
                 dtype='float64')
Similarly, divide by 3600 to get the number of hours:
    In [69]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 / 3600
    Out[69]:
    Float64Index([18.25138888888889, 19.25138888888889, 20.25138888888889,
                  21.25138888888889],
                 dtype='float64')
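To get the whole number of hours, use floor division. There are of course other ways to do this, such as the stamps.hour attribute.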
    In [70]: (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1s') % 86400 // 3600
    Out[70]: Int64Index([18, 19, 20, 21], dtype='int64')
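As one of those other ways, the same quantities can be obtained without going through the epoch at all. The following is a sketch of an alternative, not the approach used in the session: it reads the hour from the DatetimeIndex.hour attribute and measures the offset from midnight with DatetimeIndex.normalize(). The hourly frequency is passed as a pd.Timedelta rather than an alias string, since the accepted spelling of the alias ('H' or 'h') differs between pandas versions.

```python
import pandas as pd

# Same data as in the walkthrough: four timestamps one hour apart.
stamps = pd.date_range(
    "2012-10-08 18:15:05", periods=4, freq=pd.Timedelta(hours=1)
)

# Integer hour of day, read directly from each timestamp.
hours = stamps.hour

# normalize() resets each timestamp to 0:00 of the same day, so the
# difference is the time elapsed since midnight as a TimedeltaIndex.
since_midnight = stamps - stamps.normalize()

# Floor division gives whole minutes, true division keeps the fraction.
whole_minutes = since_midnight // pd.Timedelta("1min")
fractional_minutes = since_midnight / pd.Timedelta("1min")

print(hours.tolist())          # prints: [18, 19, 20, 21]
print(whole_minutes.tolist())  # prints: [1095, 1155, 1215, 1275]
```

Unlike the epoch arithmetic, normalize() works in the timestamps' own time zone, so for tz-aware data this variant yields local time of day.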
The above is based on my own experience; I hope it serves as a useful reference.
Original article: https://blog.csdn.net/qq_34184505/article/details/124380393