@@ -43,12 +43,16 @@ Creating a :class:`DataFrame` by passing a dict of objects that can be converted
 
 .. ipython:: python
 
-   df2 = pd.DataFrame({'A': 1.,
-                       'B': pd.Timestamp('20130102'),
-                       'C': pd.Series(1, index=list(range(4)), dtype='float32'),
-                       'D': np.array([3] * 4, dtype='int32'),
-                       'E': pd.Categorical(["test", "train", "test", "train"]),
-                       'F': 'foo'})
+   df2 = pd.DataFrame(
+       {
+           "A": 1.0,
+           "B": pd.Timestamp("20130102"),
+           "C": pd.Series(1, index=list(range(4)), dtype="float32"),
+           "D": np.array([3] * 4, dtype="int32"),
+           "E": pd.Categorical(["test", "train", "test", "train"]),
+           "F": "foo",
+       }
+   )
    df2
 
 The columns of the resulting :class:`DataFrame` have different
@@ -512,12 +516,14 @@ See the :ref:`Grouping section <groupby>`.
 
 .. ipython:: python
 
-   df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar',
-                            'foo', 'bar', 'foo', 'foo'],
-                      'B': ['one', 'one', 'two', 'three',
-                            'two', 'two', 'one', 'three'],
-                      'C': np.random.randn(8),
-                      'D': np.random.randn(8)})
+   df = pd.DataFrame(
+       {
+           "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
+           "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
+           "C": np.random.randn(8),
+           "D": np.random.randn(8),
+       }
+   )
    df
 
 Grouping and then applying the :meth:`~pandas.core.groupby.GroupBy.sum` function to the resulting
@@ -545,10 +551,14 @@ Stack
 
 .. ipython:: python
 
-   tuples = list(zip(*[['bar', 'bar', 'baz', 'baz',
-                        'foo', 'foo', 'qux', 'qux'],
-                       ['one', 'two', 'one', 'two',
-                        'one', 'two', 'one', 'two']]))
+   tuples = list(
+       zip(
+           *[
+               ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
+               ["one", "two", "one", "two", "one", "two", "one", "two"],
+           ]
+       )
+   )
    index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
    df = pd.DataFrame(np.random.randn(8, 2), index=index, columns=['A', 'B'])
    df2 = df[:4]
@@ -578,11 +588,15 @@ See the section on :ref:`Pivot Tables <reshaping.pivot>`.
 
 .. ipython:: python
 
-   df = pd.DataFrame({'A': ['one', 'one', 'two', 'three'] * 3,
-                      'B': ['A', 'B', 'C'] * 4,
-                      'C': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 2,
-                      'D': np.random.randn(12),
-                      'E': np.random.randn(12)})
+   df = pd.DataFrame(
+       {
+           "A": ["one", "one", "two", "three"] * 3,
+           "B": ["A", "B", "C"] * 4,
+           "C": ["foo", "foo", "foo", "bar", "bar", "bar"] * 2,
+           "D": np.random.randn(12),
+           "E": np.random.randn(12),
+       }
+   )
    df
 
 We can produce pivot tables from this data very easily:
@@ -653,8 +667,10 @@ pandas can include categorical data in a :class:`DataFrame`. For full docs, see
 
 .. ipython:: python
 
-   df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6],
-                      "raw_grade": ['a', 'b', 'b', 'a', 'a', 'e']})
+   df = pd.DataFrame(
+       {"id": [1, 2, 3, 4, 5, 6], "raw_grade": ["a", "b", "b", "a", "a", "e"]}
+   )
+
 
 Convert the raw grades to a categorical data type.
 
@@ -674,8 +690,9 @@ Reorder the categories and simultaneously add the missing categories (methods un
 
 .. ipython:: python
 
-   df["grade"] = df["grade"].cat.set_categories(["very bad", "bad", "medium",
-                                                 "good", "very good"])
+   df["grade"] = df["grade"].cat.set_categories(
+       ["very bad", "bad", "medium", "good", "very good"]
+   )
    df["grade"]
 
 Sorting is per order in the categories, not lexical order.
@@ -705,8 +722,7 @@ We use the standard convention for referencing the matplotlib API:
 
 .. ipython:: python
 
-   ts = pd.Series(np.random.randn(1000),
-                  index=pd.date_range('1/1/2000', periods=1000))
+   ts = pd.Series(np.random.randn(1000), index=pd.date_range("1/1/2000", periods=1000))
    ts = ts.cumsum()
 
    @savefig series_plot_basic.png
@@ -717,8 +733,10 @@ of the columns with labels:
 
 .. ipython:: python
 
-   df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index,
-                     columns=['A', 'B', 'C', 'D'])
+   df = pd.DataFrame(
+       np.random.randn(1000, 4), index=ts.index, columns=["A", "B", "C", "D"]
+   )
+
    df = df.cumsum()
 
    plt.figure()