
E-commerce Order Data Analysis

The dataset contains 115,609 rows and 14 columns: 10 of dtype object, 2 of dtype float64, and 2 of dtype int64. All date and timestamp columns are stored as object dtype rather than as datetimes. Three columns have missing values: order_approved_at (0.012%), order_delivered_carrier_date (1.034%), and order_delivered_customer_date (2.076%). Most rows have an order_status of delivered (113,210 rows, about 97.9%). The product_category_name_english column has 71 distinct categories, the most common being 'bed_bath_table' with 11,847 rows.

Uploaded by

khoa1903204

Number of rows: 115609

Number of columns: 14
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 115609 entries, 0 to 115608
Data columns (total 14 columns):
 #   Column                         Non-Null Count   Dtype
---  ------                         --------------   -----
 0   order_id                       115609 non-null  object
 1   customer_unique_id             115609 non-null  object
 2   order_status                   115609 non-null  object
 3   order_purchase_timestamp       115609 non-null  object
 4   order_approved_at              115595 non-null  object
 5   order_delivered_carrier_date   114414 non-null  object
 6   order_delivered_customer_date  113209 non-null  object
 7   order_estimated_delivery_date  115609 non-null  object
 8   order_item_id                  115609 non-null  int64
 9   product_id                     115609 non-null  object
 10  price                          115609 non-null  float64
 11  payment_value                  115609 non-null  float64
 12  review_score                   115609 non-null  int64
 13  product_category_name_english  115609 non-null  object
dtypes: float64(2), int64(2), object(10)
memory usage: 12.3+ MB
None
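
The code that produced the output above is not shown in the source. A minimal sketch of how such a summary could be generated with pandas follows. The small in-memory frame is a hypothetical stand-in for the real 115,609-row dataset, and the name `df` is an assumption. The trailing `None` in the output above is consistent with `df.info()` having been wrapped in `print()`, because `info()` prints its report itself and returns `None`.

```python
import pandas as pd

# Hypothetical stand-in data; the real dataset has 115,609 rows and 14 columns.
df = pd.DataFrame({
    "order_id": ["a", "b", "c"],
    "order_approved_at": ["2018-01-01 10:00:00", None, "2018-01-03 09:30:00"],
    "price": [10.0, 20.5, 7.9],
})

# shape is a (rows, columns) tuple.
n_rows, n_cols = df.shape
print(f"Number of rows: {n_rows}")
print(f"Number of columns: {n_cols}")

# info() writes the column summary to stdout and returns None,
# so it is called directly instead of being passed to print().
df.info()
```

Calling `df.info()` on its own avoids the stray `None` line at the end of the report.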

Missing data percentage per column:


order_id 0.000000
customer_unique_id 0.000000
order_status 0.000000
order_purchase_timestamp 0.000000
order_approved_at 0.012110
order_delivered_carrier_date 1.033657
order_delivered_customer_date 2.075963
order_estimated_delivery_date 0.000000
order_item_id 0.000000
product_id 0.000000
price 0.000000
payment_value 0.000000
review_score 0.000000
product_category_name_english 0.000000
dtype: float64
Columns with missing data:
order_approved_at 0.012110
order_delivered_carrier_date 1.033657
order_delivered_customer_date 2.075963
dtype: float64
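
The percentages above agree with the non-null counts in the info output: 14, 1,195, and 2,400 values are missing out of 115,609 rows. The sketch below shows one way such a table could be computed; it is not necessarily the method used in the original notebook. The sample frame and the names `missing_pct` and `cols_with_missing` are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in data with some missing timestamps.
df = pd.DataFrame({
    "order_id": ["a", "b", "c", "d"],
    "order_approved_at": ["2018-01-01", None, "2018-01-03", "2018-01-04"],
    "order_delivered_customer_date": [None, None, "2018-01-09", "2018-01-10"],
})

# isna() marks missing cells as True; the column mean of those booleans
# is the fraction missing, scaled here to a percentage.
missing_pct = df.isna().mean() * 100
print("Missing data percentage per column:")
print(missing_pct)

# Keep only the columns that have at least one missing value.
cols_with_missing = missing_pct[missing_pct > 0]
print("Columns with missing data:")
print(cols_with_missing)

# Cross-check of one reported figure, using counts from the info output:
# 115609 total rows, 115595 non-null in order_approved_at.
total_rows = 115609
approved_missing_pct = (total_rows - 115595) / total_rows * 100
```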

Value counts for order_status:


order_status
delivered 113210
shipped 1138
canceled 536
invoiced 358
processing 357
unavailable 7
approved 3
Name: count, dtype: int64
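
The seven status counts above sum to 115,609, the full row count. As a hedged illustration, counts like these can be obtained with `Series.value_counts()`, and passing `normalize=True` gives shares instead of raw counts. The sample values below are hypothetical and much smaller than the real data.

```python
import pandas as pd

# Hypothetical stand-in for the order_status column.
status = pd.Series(
    ["delivered", "delivered", "shipped", "delivered", "canceled"],
    name="order_status",
)

# Counts per status, sorted from most to least frequent.
counts = status.value_counts()
print(counts)

# Same breakdown as a percentage of all rows.
shares = status.value_counts(normalize=True) * 100
print(shares.round(2))
```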

Value counts for product_category_name_english:


product_category_name_english
bed_bath_table 11847
health_beauty 9944
sports_leisure 8942
furniture_decor 8743
computers_accessories 8105
...
arts_and_craftmanship 24
la_cuisine 15
cds_dvds_musicals 14
fashion_childrens_clothes 8
security_and_services 2
Name: count, Length: 71, dtype: int64
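
The `...` in the output above is where pandas truncates a long Series for display, showing only the first and last rows of the 71 categories. A minimal sketch, using hypothetical sample data, of how the number of distinct categories and the most and least common ones could be pulled out explicitly:

```python
import pandas as pd

# Hypothetical stand-in for the product_category_name_english column.
categories = pd.Series(
    ["bed_bath_table"] * 3 + ["health_beauty"] * 2 + ["la_cuisine"],
    name="product_category_name_english",
)

# Number of distinct categories (the Length value in the real output).
n_categories = categories.nunique()

# value_counts() sorts in descending order, so head() gives the most
# common categories and tail() the rarest.
counts = categories.value_counts()
top = counts.head(2)
bottom = counts.tail(1)

print(n_categories)
print(top)
print(bottom)
```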
