Row Selection with DataFrames
One of the most important activities when working with data is to select a subset of the rows. Below is the full Course Catalog dataset, displaying all 8,589 courses offered in Fall 2019 at The University of Illinois. When Python displays the contents of a large DataFrame, it will show three things:
- The first five rows of data,
- A row of
...
, and - The last five rows.
An example of displaying the full Course Catalog dataset:
import pandas as pd
df = pd.read_csv("https://waf.cs.illinois.edu/discovery/course-catalog.csv")
df
Year | Term | YearTerm | Subject | Number | Name | Description | Credit Hours | Section Info | Degree Attributes | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 2019 | Fall | 2019-fa | AAS | 100 | Intro Asian American Studies | Interdisciplinary introduction to the basic co... | 3 hours. | NaN | Social & Beh Sci - Soc Sci, and Cultural Studi... |
1 | 2019 | Fall | 2019-fa | AAS | 105 | Introduction to Arab American Studies | Interdisciplinary introduction to the basic co... | 3 hours. | NaN | Cultural Studies - US Minority course. |
2 | 2019 | Fall | 2019-fa | AAS | 120 | Intro to Asian Am Pop Culture | Introductory understanding of the way U.S. pop... | 3 hours. | NaN | Cultural Studies - US Minority course. |
3 | 2019 | Fall | 2019-fa | AAS | 199 | Undergraduate Open Seminar | May be repeated to a maximum of 6 hours. | 1 TO 5 hours. | NaN | NaN |
4 | 2019 | Fall | 2019-fa | AAS | 200 | U.S. Race and Empire | Invites students to examine histories and narr... | 3 hours. | Same as LLS 200. | Humanities - Hist & Phil, and Cultural Studies... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
8584 | 2019 | Fall | 2019-fa | ZULU | 202 | Elementary Zulu II | Continuation of ZULU 201 with introduction of ... | 5 hours. | Same as AFST 252. Participation in the languag... | NaN |
8585 | 2019 | Fall | 2019-fa | ZULU | 403 | Intermediate Zulu I | Survey of more advanced grammar; emphasis on i... | 4 hours. | NaN | NaN |
8586 | 2019 | Fall | 2019-fa | ZULU | 404 | Intermediate Zulu II | Continuation of ZULU 403; emphasis on increasi... | 4 hours. | NaN | NaN |
8587 | 2019 | Fall | 2019-fa | ZULU | 405 | Advanced Zulu I | Third year Zulu with emphasis on conversationa... | 3 hours. | NaN | NaN |
8588 | 2019 | Fall | 2019-fa | ZULU | 406 | Advanced Zulu II | Continuation of Zulu 405 with increased emphas... | 3 hours. | NaN | NaN |
The full 2019 Course Catalog dataset stored in a DataFrame (8,589 rows).
What is Row Selection?
Row selection is any operation that selects only a subset of all the rows in a DataFrame. All of the following are examples of selecting a subset of rows from the Course Catalog dataset:
- Selecting all courses offered in the
STAT
department, - Selecting a random 1% sample of all courses offered at Illinois,
- Selecting five random courses offered at Illinois,
- Selecting the first ten courses from the dataset,
- Selecting the last five courses from the dataset,
- Selecting all courses with a
Number
in the 100s, or - Selecting all courses with a course number greater than or equal to 241.
Quick Reference: Pandas Cheat Sheet
The pandas project provides an official "cheat sheet" of incredibly useful commands that can be found at https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf. This sheet is incredibly useful, and many data scientists even have one printed out at their workspace!


We will explore areas of this cheat sheet so you can become familiar with it as a reference sheet. As you do more and more data science, you will find you will start to memorize the commands you use most often.
Pandas Cheat Sheet Section: "Subset Observation (Rows)"
The section in the bottom-center of the pandas cheat sheet is titled "Subset Observations (Rows)" and provides us with the most common functions to select a subset of the rows (also called observations) from our data:


Row Selection Based on the Contents of the Row
One of the most useful forms of row selection is to select only rows that match a specific criteria.
To select the rows that match a criteria, we need a conditional and to follow the first example given on the cheat sheet: df[df.Length > 7]
. This line of code has four components:
The outer of
df[ ___________ ]
specifies that we will be selecting data from the DataFrame nameddf
. The "square brackets" are necessary.The inner
df.Length
specifies that we want to look at the column namedLength
inside of the DataFrame nameddf
.The
>
is a logical operation meaning "greater than". This is one of six common options that include:>
, for strictly greater than,>=
, for greater than or equal to<
, for strictly less than,<=
, for less than or equal to,==
, for exactly equal to, and!=
, for NOT exactly equal to,
The
7
is value we are checking the data against, specifically checking if the data in each row is greater than 7.- Since
7
is a number, quotes are NOT used. - If we had a string like
"Discovery"
, quotes MUST be used.
- Since
Example: Selecting the Courses Taught in Statistics
Let's use this syntax to select all courses taught by the STAT department. Referring to the dataset, the department that teaches the courses in the dataset is contained in the column named Subject
:
Year | Term | YearTerm | Subject | Number | Name | Description | Credit Hours | Section Info | Degree Attributes | |
---|---|---|---|---|---|---|---|---|---|---|
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Column headers of the Course Catalog dataset, with the Subject
column highlighted.
Therefore, we need to check if the contents of each row's Subject
data is equal to "STAT"
:
df[ df.Subject == "STAT" ]
Year | Term | YearTerm | Subject | Number | Name | Description | Credit Hours | Section Info | Degree Attributes | |
---|---|---|---|---|---|---|---|---|---|---|
8002 | 2019 | Fall | 2019-fa | STAT | 100 | Statistics | First course in probability and statistics at ... | 3 hours. | Credit is not given for both STAT 100 and any ... | Quantitative Reasoning I course. |
8003 | 2019 | Fall | 2019-fa | STAT | 107 | Data Science Discovery | Data Science Discovery is the intersection of ... | 4 hours. | Same as CS 107 and IS 107. | Quantitative Reasoning I course. |
8004 | 2019 | Fall | 2019-fa | STAT | 199 | Undergraduate Open Seminar | See course schedule for topics. Approved for L... | 1 TO 5 hours. | NaN | NaN |
8005 | 2019 | Fall | 2019-fa | STAT | 200 | Statistical Analysis | Survey of statistical concepts, data analysis,... | 3 hours. | Credit is not given for both STAT 200 and STAT... | Quantitative Reasoning I course. |
8006 | 2019 | Fall | 2019-fa | STAT | 212 | Biostatistics | Application of statistical reasoning and stati... | 3 hours. | Credit is not given for both STAT 212 and STAT... | Quantitative Reasoning I course. |
8007 | 2019 | Fall | 2019-fa | STAT | 361 | Probability & Statistics for Computer Science | Same as CS 361. See CS 361. | 3 hours. | Same as CS 361. See CS 361. | NaN |
8008 | 2019 | Fall | 2019-fa | STAT | 385 | Statistics Programming Methods | Statisticians must be savvy in programming met... | 3 hours. | Prerequisite: STAT 200 or STAT 212. | NaN |
8009 | 2019 | Fall | 2019-fa | STAT | 390 | Individual Study | May be repeated to a maximum of 8 hours. Prere... | 1 OR 2 hours. | May be repeated to a maximum of 8 hours. Prere... | NaN |
8010 | 2019 | Fall | 2019-fa | STAT | 391 | Honors Individual Study | May be repeated to a maximum of 8 hours. Prere... | 1 OR 2 hours. | May be repeated to a maximum of 8 hours. Prere... | NaN |
8011 | 2019 | Fall | 2019-fa | STAT | 400 | Statistics and Probability I | Introduction to mathematical statistics that d... | 4 hours. | Same as MATH 463. 4 undergraduate hours. 4 gra... | NaN |
8012 | 2019 | Fall | 2019-fa | STAT | 408 | Actuarial Statistics I | Examines elementary theory of probability, inc... | 4 hours. | NaN | NaN |
8013 | 2019 | Fall | 2019-fa | STAT | 409 | Actuarial Statistics II | Continuation of STAT 408. Examines parametric ... | 4 hours. | Same as ASRM 402. 4 undergraduate hours. 4 gra... | NaN |
8014 | 2019 | Fall | 2019-fa | STAT | 410 | Statistics and Probability II | Continuation of STAT 400. Includes moment-gene... | 3 OR 4 hours. | Same as MATH 464. 3 undergraduate hours. 4 gra... | NaN |
8015 | 2019 | Fall | 2019-fa | STAT | 420 | Methods of Applied Statistics | Systematic, calculus-based coverage of the mor... | 3 OR 4 hours. | Same as ASRM 450. 3 undergraduate hours. 4 gra... | NaN |
8016 | 2019 | Fall | 2019-fa | STAT | 424 | Analysis of Variance | Estimation and hypotheses testing in linear mo... | 3 OR 4 hours. | NaN | NaN |
8017 | 2019 | Fall | 2019-fa | STAT | 425 | Applied Regression and Design | Explores linear regression, least squares esti... | 3 OR 4 hours. | 3 undergraduate hours. 4 graduate hours. Prere... | NaN |
8018 | 2019 | Fall | 2019-fa | STAT | 426 | Sampling and Categorical Data | Sampling: simple random, stratified, systemati... | 3 OR 4 hours. | NaN | NaN |
8019 | 2019 | Fall | 2019-fa | STAT | 427 | Statistical Consulting | Students, working in groups under the supervis... | 3 OR 4 hours. | NaN | NaN |
8020 | 2019 | Fall | 2019-fa | STAT | 428 | Statistical Computing | Examines statistical packages, numerical analy... | 3 OR 4 hours. | Same as CSE 428. 3 undergraduate hours. 4 grad... | NaN |
8021 | 2019 | Fall | 2019-fa | STAT | 429 | Time Series Analysis | Studies theory and data analysis for time seri... | 3 OR 4 hours. | 3 undergraduate hours. 4 graduate hours. Prere... | NaN |
8022 | 2019 | Fall | 2019-fa | STAT | 430 | Topics in Applied Statistics | Formulation and analysis of mathematical model... | 3 OR 4 hours. | 3 undergraduate hours. 4 graduate hours. May b... | NaN |
8023 | 2019 | Fall | 2019-fa | STAT | 431 | Applied Bayesian Analysis | Introduction to the concepts and methodology o... | 3 OR 4 hours. | NaN | NaN |
8024 | 2019 | Fall | 2019-fa | STAT | 432 | Basics of Statistical Learning | Topics in supervised and unsupervised learning... | 3 OR 4 hours. | Same as ASRM 451. 3 undergraduate hours. 4 gra... | NaN |
8025 | 2019 | Fall | 2019-fa | STAT | 433 | Stochastic Processes | A stochastic process is a random process that ... | 3 OR 4 hours. | 3 undergraduate hours. 4 graduate hours. Prere... | NaN |
8026 | 2019 | Fall | 2019-fa | STAT | 434 | Survival Analysis | Introduction to the analysis of time-to-event ... | 3 OR 4 hours. | 3 undergraduate hours. 4 graduate hours. Prere... | NaN |
8027 | 2019 | Fall | 2019-fa | STAT | 440 | Statistical Data Management | The critical elements of data storage, data cl... | 3 OR 4 hours. | Same as CSE 440. 3 undergraduate hours. 4 grad... | NaN |
8028 | 2019 | Fall | 2019-fa | STAT | 443 | Professional Statistics | This project-based course emphasizes written, ... | 3 OR 4 hours. | 3 undergraduate hours. 4 graduate hours. Prere... | NaN |
8029 | 2019 | Fall | 2019-fa | STAT | 448 | Advanced Data Analysis | Several of the most widely used techniques of ... | 4 hours. | Same as CSE 448. 4 undergraduate hours. 4 grad... | NaN |
8030 | 2019 | Fall | 2019-fa | STAT | 458 | Math Modeling in Life Sciences | Same as ANSC 448 and IB 487. See ANSC 448. | 3 OR 4 hours. | NaN | NaN |
8031 | 2019 | Fall | 2019-fa | STAT | 480 | Data Science Foundations | Examines the methods of data management and an... | 3 OR 4 hours. | 3 undergraduate hours. 4 graduate hours. Prere... | NaN |
8032 | 2019 | Fall | 2019-fa | STAT | 510 | Mathematical Statistics I | Distributions, transformations, order-statisti... | 4 hours. | Prerequisite: STAT 410. | NaN |
8033 | 2019 | Fall | 2019-fa | STAT | 511 | Mathematical Statistics II | Bayes estimates, minimaxity, admissibility; ma... | 4 hours. | NaN | NaN |
8034 | 2019 | Fall | 2019-fa | STAT | 525 | Computational Statistics | Various topics, such as ridge regression; robu... | 4 hours. | NaN | NaN |
8035 | 2019 | Fall | 2019-fa | STAT | 527 | Advanced Regression Analysis | An advanced introduction to regression analysi... | 4 hours. | 4 graduate hours. No professional credit. Prer... | NaN |
8036 | 2019 | Fall | 2019-fa | STAT | 530 | Bioinformatics | Same as ANSC 543, CHBE 571, and MCB 571. See C... | 4 hours. | NaN | NaN |
8037 | 2019 | Fall | 2019-fa | STAT | 534 | Advanced Survival Analysis | Introduction to the analysis of time-to-event ... | 4 hours. | NaN | NaN |
8038 | 2019 | Fall | 2019-fa | STAT | 538 | Clinical Trials Methodology | The topics of the course focus on clinical tri... | 4 hours. | NaN | NaN |
8039 | 2019 | Fall | 2019-fa | STAT | 541 | Predictive Analytics | Same as ASRM 552. See ASRM 552. | 4 hours. | NaN | NaN |
8040 | 2019 | Fall | 2019-fa | STAT | 542 | Statistical Learning | Modern techniques of predictive modeling, clas... | 4 hours. | Same as ASRM 551 and CSE 542. 4 graduate hours... | NaN |
8041 | 2019 | Fall | 2019-fa | STAT | 543 | Appl. Multivariate Statistics | Same as CPSC 543. See CPSC 543. | 4 hours. | NaN | NaN |
8042 | 2019 | Fall | 2019-fa | STAT | 545 | Spatial Statistics | Theory and methods for analyzing univariate an... | 4 hours. | NaN | NaN |
8043 | 2019 | Fall | 2019-fa | STAT | 546 | Machine Learning in Data Science | Trains students to analyze large complex data ... | 4 hours. | NaN | NaN |
8044 | 2019 | Fall | 2019-fa | STAT | 551 | Theory of Probability I | Same as MATH 561. See MATH 561. | 4 hours. | NaN | NaN |
8045 | 2019 | Fall | 2019-fa | STAT | 552 | Theory of Probability II | Same as MATH 562. See MATH 562. | 4 hours. | Same as MATH 562. See MATH 562. | NaN |
8046 | 2019 | Fall | 2019-fa | STAT | 553 | Probability and Measure I | Measures and probabilities; integration and ex... | 4 hours. | Prerequisite: MATH 447 or consent of instructor. | NaN |
8047 | 2019 | Fall | 2019-fa | STAT | 554 | Probability and Measure II | Measure extensions, Lebesque-Stieltjes measure... | 4 hours. | NaN | NaN |
8048 | 2019 | Fall | 2019-fa | STAT | 555 | Applied Stochastic Processes | Same as MATH 564. See MATH 564. | 4 hours. | Same as MATH 564. See MATH 564. | NaN |
8049 | 2019 | Fall | 2019-fa | STAT | 558 | Risk Modeling and Analysis | Same as MATH 563. See MATH 563. | 4 hours. | Same as MATH 563. See MATH 563. | NaN |
8050 | 2019 | Fall | 2019-fa | STAT | 571 | Multivariate Analysis | Inference in multivariate statistical populati... | 4 hours. | Prerequisite: STAT 410 and MATH 415, or consen... | NaN |
8051 | 2019 | Fall | 2019-fa | STAT | 575 | Large Sample Theory | Limiting distribution of maximum likelihood es... | 4 hours. | NaN | NaN |
8052 | 2019 | Fall | 2019-fa | STAT | 578 | Topics in Statistics | May be repeated if topics vary. Prerequisite: ... | 4 hours. | May be repeated if topics vary. Prerequisite: ... | NaN |
8053 | 2019 | Fall | 2019-fa | STAT | 587 | Hierarchical Linear Models | Same as PSYC 587 and EPSY 587. See EPSY 587. | 4 hours. | NaN | NaN |
8054 | 2019 | Fall | 2019-fa | STAT | 588 | Covar Struct and Factor Models | Same as EPSY 588, PSYC 588, and SOC 588. See P... | 4 hours. | NaN | NaN |
8055 | 2019 | Fall | 2019-fa | STAT | 590 | Individual Study and Research | Directed reading and research. Approved for le... | 0 TO 8 hours. | Approved for letter and S/U grading. May be re... | NaN |
8056 | 2019 | Fall | 2019-fa | STAT | 593 | STAT Internship | Supervised, off-campus experience in a field i... | 0 TO 8 hours. | Approved for letter and S/U grading. Prerequis... | NaN |
8057 | 2019 | Fall | 2019-fa | STAT | 595 | Preparing Future Faculty | Prepares Ph.D. students who are interested in ... | 2 hours. | NaN | NaN |
8058 | 2019 | Fall | 2019-fa | STAT | 599 | Thesis Research | Approved for S/U grading only. May be repeated... | 0 TO 16 hours. | Approved for S/U grading only. May be repeated... | NaN |
The subset of rows where the Subject
is exactly equal to STAT
(57 rows).
Notice that the first row has an index of 8002 and the subject is STAT for ALL of the rows! 🎉
A few important things to note:
We are checking if the subject is exactly equal to to the string
"STAT"
. To check if something is exactly equal to another value in Python, we use==
or "double equals".The data we're checking is NOT all numbers, so the data type of the subject names are strings. This requires that the value is contained in quotes (either
'STAT'
or"STAT"
).The column name (
Subject
) is also a string, but the use of the "dot notation" is a shortcut and one of the few exceptions where a string does not need quotes. If we wanted to be more explicit, the longer way to write this same statement is:df[ df["Subject"] == "STAT" ]
. This style will be required if the column name contains a space.
Example: Selecting all Graduate / Professional Courses
At the University of Illinois, courses are specifically graduate or professional courses if their course number is at least 500 (specifically 500-799, but there are no courses higher than that).
The column name that contains the course number is Number
and we can select the rows where the course number is greater than or equal to 500:
# Dot Notation:
df[ df.Number >= 500 ]
Year | Term | YearTerm | Subject | Number | Name | Description | Credit Hours | Section Info | Degree Attributes | |
---|---|---|---|---|---|---|---|---|---|---|
44 | 2019 | Fall | 2019-fa | AAS | 501 | Theory and Methods in AAS | Foundational gateway course for graduate study... | 4 hours. | NaN | NaN |
45 | 2019 | Fall | 2019-fa | AAS | 539 | Youth, Culture and Society | Same as EPS 539 and HDFS 539. See HDFS 539. | 4 hours. | NaN | NaN |
46 | 2019 | Fall | 2019-fa | AAS | 561 | Race and Cultural Critique | Introduction to graduate level theoretical and... | 4 hours. | NaN | NaN |
47 | 2019 | Fall | 2019-fa | AAS | 589 | Readings in Asian Am Studies | Individual guidance in intensive readings in t... | 1 TO 4 hours. | NaN | NaN |
48 | 2019 | Fall | 2019-fa | AAS | 590 | Asian Am Studies Seminar | Approved for letter and S/U grading. May be re... | 2 TO 4 hours. | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
8561 | 2019 | Fall | 2019-fa | VM | 635 | Veterinary Medical Spanish | In this course second year veterinary students... | 2 hours. | No graduate credit. 2 professional hours. Appr... | NaN |
8562 | 2019 | Fall | 2019-fa | VM | 642 | Contemporary Issues in Vet Med | An introductory course for first year veterina... | 1 hours. | NaN | NaN |
8563 | 2019 | Fall | 2019-fa | VM | 643 | Fundamentals of Management | An introductory course for second year veterin... | 1 hours. | No graduate credit. 1 professional hour. Appro... | NaN |
8564 | 2019 | Fall | 2019-fa | VM | 645 | Communications in Practice | An introductory course for third year veterina... | 1 hours. | NaN | NaN |
8565 | 2019 | Fall | 2019-fa | VM | 694 | Veterinary Medicine | To be used to designate a trial or experimenta... | 1 TO 4 hours. | NaN | NaN |
The subset of rows at the graduate or professional level (2,773 total rows).
# Standard Notation:
df[ df['Number'] >= 500 ]
Year | Term | YearTerm | Subject | Number | Name | Description | Credit Hours | Section Info | Degree Attributes | |
---|---|---|---|---|---|---|---|---|---|---|
44 | 2019 | Fall | 2019-fa | AAS | 501 | Theory and Methods in AAS | Foundational gateway course for graduate study... | 4 hours. | NaN | NaN |
45 | 2019 | Fall | 2019-fa | AAS | 539 | Youth, Culture and Society | Same as EPS 539 and HDFS 539. See HDFS 539. | 4 hours. | NaN | NaN |
46 | 2019 | Fall | 2019-fa | AAS | 561 | Race and Cultural Critique | Introduction to graduate level theoretical and... | 4 hours. | NaN | NaN |
47 | 2019 | Fall | 2019-fa | AAS | 589 | Readings in Asian Am Studies | Individual guidance in intensive readings in t... | 1 TO 4 hours. | NaN | NaN |
48 | 2019 | Fall | 2019-fa | AAS | 590 | Asian Am Studies Seminar | Approved for letter and S/U grading. May be re... | 2 TO 4 hours. | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
8561 | 2019 | Fall | 2019-fa | VM | 635 | Veterinary Medical Spanish | In this course second year veterinary students... | 2 hours. | No graduate credit. 2 professional hours. Appr... | NaN |
8562 | 2019 | Fall | 2019-fa | VM | 642 | Contemporary Issues in Vet Med | An introductory course for first year veterina... | 1 hours. | NaN | NaN |
8563 | 2019 | Fall | 2019-fa | VM | 643 | Fundamentals of Management | An introductory course for second year veterin... | 1 hours. | No graduate credit. 1 professional hour. Appro... | NaN |
8564 | 2019 | Fall | 2019-fa | VM | 645 | Communications in Practice | An introductory course for third year veterina... | 1 hours. | NaN | NaN |
8565 | 2019 | Fall | 2019-fa | VM | 694 | Veterinary Medicine | To be used to designate a trial or experimenta... | 1 TO 4 hours. | NaN | NaN |
The subset of rows at the graduate or professional level (2,773 total rows). We can see that the output is the same for both statements.
Random Row Selection
In many areas of data science, it will be important to divide the data into random groups or randomly sample a subset of the collected data. In both cases, we are taking a "random sample" of our data. The pandas library provides two methods to randomly sample the data:
Sample a fraction of the data by using
df.sample()
with afrac
parameter:- Ex:
df.sample(frac=0.5)
samples 50% (0.5) of our data randomly. - Ex:
df.sample(frac=0.01)
samples 1% (0.01) of our data randomly.
- Ex:
Sample a specific number of rows/observations from the data by using
df.sample()
with an
parameter (or no parameter at all!):- Ex:
df.sample(n=10)
samples exactly 10 rows randomly. - Ex:
df.sample(n=107)
samples exactly 107 rows randomly. - Ex:
df.sample(10)
also samples exactly 10 rows randomly (then
is not required, butfrac
is required).
- Ex:
Since this is a random sample, every time you run df.sample
the result will be different!
Example: A Random Course Schedule of Five Courses
An average student at the University of Illinois will take around five courses each semester. Using the Course Catalog dataset, we can randomly pick five courses to create a random schedule:
df.sample(n=5)
Year | Term | YearTerm | Subject | Number | Name | Description | Credit Hours | Section Info | Degree Attributes | |
---|---|---|---|---|---|---|---|---|---|---|
3665 | 2019 | Fall | 2019-fa | FIN | 520 | Financial Management | Introduction to financial management and decis... | 4 hours. | Prerequisite: Enrollment in the Executive MBA,... | NaN |
3523 | 2019 | Fall | 2019-fa | ESE | 100 | Sustainable Earth | Same as GCL 124. See GCL 124. | 3 hours. | NaN | Nat Sci & Tech - Phys Sciences course. |
1096 | 2019 | Fall | 2019-fa | ARTF | 106 | Visualization Drawing | This studio course introduces students to the ... | 3 hours. | Additional fees may apply. See Class Schedule.... | NaN |
8558 | 2019 | Fall | 2019-fa | VM | 623 | Research Project II | In this lecture/discussion course, which is a ... | 2 hours. | NaN | NaN |
3261 | 2019 | Fall | 2019-fa | ENGL | 486 | History of Translation | Same as CLCV 430, CWL 430, GER 405, SLAV 430, ... | 3 OR 4 hours. | NaN | NaN |
A random selection of five rows (5 total rows).
Running the exact same code again, our results are completely different since the selection is random:
df.sample(n=5)
Year | Term | YearTerm | Subject | Number | Name | Description | Credit Hours | Section Info | Degree Attributes | |
---|---|---|---|---|---|---|---|---|---|---|
1093 | 2019 | Fall | 2019-fa | ARTF | 103 | Design I | Theory and practice in the elements, processes... | 3 hours. | NaN | NaN |
463 | 2019 | Fall | 2019-fa | AFST | 231 | Elementary Swahili I | Same as SWAH 201. See SWAH 201. | 5 hours. | Same as SWAH 201. See SWAH 201. | NaN |
4182 | 2019 | Fall | 2019-fa | GWS | 325 | Lesbian/Queer Media Cultures | Discusses how various LGBT/Q communities were ... | 3 hours. | NaN | NaN |
7791 | 2019 | Fall | 2019-fa | SOCW | 200 | Introduction to Social Work | Broad survey of the field of social work; intr... | 3 hours. | NaN | Social & Beh Sci - Soc Sci course. |
7790 | 2019 | Fall | 2019-fa | SOCW | 199 | Undergraduate Open Seminar | Approved for letter and S/U grading. May be re... | 1 TO 4 hours. | Approved for letter and S/U grading. May be re... | NaN |
Another random selection of five rows (5 total rows).
Interested in learning different methods to select rows in a DataFrame? Check out our Guides on Row Selection using DataFrames:
Example Walk-Throughs with Worksheets
Video 1: Row Selection with Pandas (Part 1)
Video 2: Row Selection with Pandas (Part 2)
Practice Questions
Q1: Which answer shows two statements that will create the same subset of rows?Q2: I want to randomly select 5 movies from a dataset titled "topMovies.csv" holding the top 100 user rated movies on IMDB. How would I do this in Python?
Q3: You're looking for a new apartment for you and your cat and using a dataset "apartments" to look at listings. How would you filter the dataset in Python to only show pet-friendly listings, using the column "PetFriendly" which is either True or False?
Q4: In order to select the first 10 students from a class roster dataset titled "STAT207", what command would you use in Python?
Q5: A dataset "restaurants" contains the columns "Name", "Location", and "Rating" with ratings ranging from 0-100. How would you filter to only restaurants with a rating greater than 90 in Python?
Q6: If you had a dataframe titled "df" with a row titled "scores", how would you select the rows with the highest 3 scores in Python?