{"id":23655,"date":"2021-11-29T09:26:09","date_gmt":"2021-11-29T09:26:09","guid":{"rendered":"https:\/\/www.askpython.com\/?p=23655"},"modified":"2021-11-29T10:19:06","modified_gmt":"2021-11-29T10:19:06","slug":"pandas-shape-attribute","status":"publish","type":"post","link":"https:\/\/www.askpython.com\/python-modules\/pandas\/pandas-shape-attribute","title":{"rendered":"The Pandas Shape Attribute &#8211; A Complete Guide"},"content":{"rendered":"\n<p>Pandas is an extensive library for preprocessing external data and creating datasets. It is one of the main Python packages for cleaning information and preparing it for further use. <\/p>\n\n\n\n<p>One of its most useful features is that it can read large amounts of data, including data fetched from remote servers. <\/p>\n\n\n\n<p>This is especially helpful when scraping the web or collecting data online with Python. This article covers one of the notable features of this library:\u00a0<strong>the Pandas shape attribute.<\/strong><\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Prerequisites<\/h3>\n\n\n\n<p>Before we start, let us check the tools we need for this tutorial. 
Here is what we will use.<\/p>\n\n\n\n<p><strong>Tools and technologies:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\"><li><strong>Python: <em>version 3.6 or above<\/em><\/strong><\/li><li><strong>IDE: <em><a href=\"https:\/\/www.askpython.com\/python\/jupyter-notebook-for-python\" data-type=\"post\" data-id=\"12648\">Jupyter Notebooks<\/a><\/em><\/strong> <\/li><li><strong>Browser: <em>Google Chrome<\/em><\/strong><\/li><li><strong>Environment: <em><a href=\"https:\/\/www.askpython.com\/python-modules\/python-anaconda-tutorial\" data-type=\"post\" data-id=\"10679\">Anaconda<\/a><\/em><\/strong><\/li><li><strong>Supportive packages: <em><a href=\"https:\/\/www.askpython.com\/python-modules\/numpy\/python-numpy-module\" data-type=\"post\" data-id=\"7694\">NumPy<\/a> and <a href=\"https:\/\/www.askpython.com\/python-modules\/matplotlib\/python-matplotlib\" data-type=\"post\" data-id=\"3182\">Matplotlib<\/a><\/em><\/strong><\/li><li><em>A stable internet connection (necessary only to read data from the server)<\/em>.<\/li><\/ol>\n\n\n\n<p><strong>What we&#8217;ll cover in this article:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>What is the shape attribute in Pandas<\/li><li>Reading a dataset<\/li><li>Using <strong>shape <\/strong>on that dataset<\/li><\/ol>\n\n\n\n<p>Now we are ready, so let us jump right in!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What is the shape attribute in Pandas?<\/h2>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"464\" height=\"375\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/General-format-of-a-table-1.png\" alt=\"General Format Of A Table 1\" class=\"wp-image-23808\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/General-format-of-a-table-1.png 464w, 
https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/General-format-of-a-table-1-300x242.png 300w\" sizes=\"auto, (max-width: 464px) 100vw, 464px\" \/><figcaption>General Format Of Table 1<\/figcaption><\/figure><\/div>\n\n\n\n<p>A data frame is a tabular representation of information about a specific topic. The data can come from many sources and industries. Almost every individual and organization maintains critical data, and its principal format is tabular. This tabular data is stored in various formats such as SQL, Excel, JSON, etc. The image above shows the general layout of such a table.<\/p>\n\n\n\n<p>A dataset can be either small or large. In many cases, it is much larger than we expect, so mistakes can easily happen when counting its rows and columns by hand. <\/p>\n\n\n\n<p><strong>To tackle this difficulty, the shape attribute in the pandas library returns the actual number of rows and columns in a dataset or a data frame. <\/strong><\/p>\n\n\n\n<p><strong>Syntax to read any dataset&#8217;s shape<\/strong> &#8211; This is the general syntax to read the shape of a dataset:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ndataframe.shape\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\">Reading a dataset in Pandas<\/h2>\n\n\n\n<p>Reading a dataset loads its contents into a data frame so that we can see what it actually contains. This is done with the <strong>read <\/strong>functions in Pandas. There is a separate reader for each file format, such as read_csv, read_excel, and read_json. 
We will read <strong>three <\/strong>datasets to check each one&#8217;s shape.<\/p>\n\n\n\n<p><strong>Datasets used:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\"><li><strong><em>Test_set.csv<\/em><\/strong><\/li><li><strong><em>salary.csv<\/em><\/strong><\/li><li><strong><em>titanic.csv<\/em><\/strong><\/li><\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">General syntax to read a dataset:<\/h3>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport pandas as pd\ndata_variable = pd.read_csv(&#039;filename.csv&#039;) \n\n# read_csv reads CSV files. Pandas provides other readers, such as read_excel and read_json, for other file formats.\n<\/pre><\/div>\n\n\n<h3 class=\"wp-block-heading\">Dataset 1<\/h3>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"376\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/Image-of-dataset-1-1024x376.png\" alt=\"Image Of Dataset 1\" class=\"wp-image-23804\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/Image-of-dataset-1-1024x376.png 1024w, https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/Image-of-dataset-1-300x110.png 300w, https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/Image-of-dataset-1-768x282.png 768w, https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/Image-of-dataset-1.png 1107w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Reading dataset 1 and retrieving its shape<\/figcaption><\/figure><\/div>\n\n\n\n<p>In the image above, we can see how the shape attribute works. It returns a tuple of two values: the first is the number of rows and the second is the number of columns. In this case, the dataset is fairly large. 
<strong>It has 2,671 rows and 10 columns<\/strong>.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Dataset 2<\/h3>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"561\" height=\"313\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/reading-dataset-2.png\" alt=\"Reading Dataset 2\" class=\"wp-image-23806\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/reading-dataset-2.png 561w, https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/reading-dataset-2-300x167.png 300w\" sizes=\"auto, (max-width: 561px) 100vw, 561px\" \/><figcaption>Reading dataset 2 and retrieving its shape<\/figcaption><\/figure><\/div>\n\n\n\n<p>This dataset is <strong>salary.csv<\/strong>, and its shape is (16, 4). Thus it has 16 rows and 4 columns.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Dataset 3<\/h3>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"313\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/reading-dataset-3-1024x313.png\" alt=\"Reading Dataset 3\" class=\"wp-image-23810\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/reading-dataset-3-1024x313.png 1024w, https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/reading-dataset-3-300x92.png 300w, https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/reading-dataset-3-768x235.png 768w, https:\/\/www.askpython.com\/wp-content\/uploads\/2021\/11\/reading-dataset-3.png 1062w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Reading dataset 3<\/figcaption><\/figure><\/div>\n\n\n\n<p>This dataset is titanic.csv. From the shape attribute, we can see that it has <strong>418 rows and 12 columns<\/strong>. 
<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Some different ways to use the shape attribute<\/h3>\n\n\n\n<p>Now that we have seen how to use <strong>shape <\/strong>through these three examples, there are two further uses of this attribute worth noting.<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li><strong>To retrieve only row count.<\/strong><\/li><li><strong>To retrieve only column count.<\/strong><\/li><\/ol>\n\n\n\n<p>As we know, shape returns a tuple of (rows, columns), so we can use <strong>indexing <\/strong>to get each value on its own. Tuples are immutable, but their elements are accessible by index, just as with lists. Let us see an example:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ntuple_1 = (12, 42, 45, 90)\n\nprint(tuple_1&#x5B;3])\nprint(tuple_1&#x5B;0])\nprint(tuple_1&#x5B;1])\n\n# Output\n# 90\n# 12\n# 42\n\n<\/pre><\/div>\n\n\n<p><strong>To retrieve the row count, access index 0; for the column count, access index 1.<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ndata.shape&#x5B;0] # returns number of rows\ndata.shape&#x5B;1] # returns number of columns\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>This is how the shape attribute works in Pandas. It is one of the key attributes that we use during data preprocessing. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pandas is an extensive library for external data preprocessing and internal dataset creation. It is one of the main packages that help in preprocessing information and cleaning it for better use. The best feature is that it enables to read and fetch a large amount of data from the servers. 
This helps a lot better [&hellip;]<\/p>\n","protected":false},"author":36,"featured_media":23814,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[94],"tags":[],"class_list":["post-23655","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-pandas"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/23655","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/users\/36"}],"replies":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/comments?post=23655"}],"version-history":[{"count":0,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/23655\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media\/23814"}],"wp:attachment":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media?parent=23655"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/categories?post=23655"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/tags?post=23655"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}