About spreadsheet basics
- Get link
- X
- Other Apps
Microsoft Excel
Office Quick Starts: Scroll down to the Downloadable guides section to download the Excel Quick Start Guide: This PDF guide begins with a labeled map of Excel that can guide you through the basic tasks you can accomplish in Excel. For tips on starting and opening Excel, this Microsoft Support page will show you how to begin a new workbook.
Excel video training: This is a collection of step-by-step videos to use all sorts of Excel features, including adding and working within rows, columns, and cells; formatting; using formulas and functions; and adding charts and pivot tables.
Sort data in a range or table: This page guides you through all of the steps you will need to sort data by number, text, and color. You’ll also have the option to sort by custom list so that you can customize exactly what you want to sort.
Filter data in a range or table: This article has step-by-step instructions on how to filter an Excel spreadsheet to show only the data you want to see. You can also use built-in comparison operators, such as “greater than” and “top 10” to reveal only the most relevant data.
Format a worksheet: The guide will help you select and format your Excel spreadsheet, then change the borders, shading, colors, and text. This can help improve your spreadsheet’s readability.
Pro tip: If you’re searching for information about using customizable options, check out Microsoft’s Guidelines for organizing and formatting data on a worksheet. This article provides clear methods for creating easy-to-read spreadsheets.
Google Sheets
Google Sheets cheat sheet: The cheat sheet puts all the basics of Sheets on a single page for easy reference. Here, you can learn about customizing your spreadsheet and the data inside; working with rows, columns, and cells; sharing your spreadsheet with others; creating different versions and copies of a spreadsheet; and more.
Get started with Sheets: Create and import files: This guide is a step-by-step guide for working with Sheets. You start by learning how to open a spreadsheet, then move on to adding data.
Sort and filter your data: This resource can help you organize data in Sheets. Use this guide to sort part or all of a spreadsheet. You can sort by text, number, and color. Then, learn how to create filters to show only certain data while hiding the rest. Finally, the article includes information on creating, saving, and removing a filter view.
Edit and format a spreadsheet: This will help you make easy-to-read spreadsheets. You will learn how to assign a color, customize borders around cells, and change the appearance of text. If you’d like to give your spreadsheet a theme, you can scroll to the bottom of the page and find how to apply it to parts of your spreadsheet.
Tip: Microsoft Excel and Google Sheets are very similar in terms of calculations, formulas, functions, and many other features. But there are some differences, which can make it tricky to switch from one to the other. If you are moving between Excel and Google Sheets, find a quick list of the differences between the two kinds of spreadsheet applications in Overview: Differences between Sheets and Excel.
Ref:
https://support.google.com/a/users/answer/9331278?hl=en
https://support.google.com/a/users/answer/9300311?hl=en&ref_topic=9296423
https://support.google.com/docs/answer/46973?co=GENIE.Platform%3DDesktop&hl=en&oco=0
https://support.google.com/a/users#topic=9917952
From external source to a spreadsheet
When you work with spreadsheets, there are a few different ways to import data. This reading covers how you can import data from external sources, specifically:
Other spreadsheets
CSV files
HTML tables (in web pages)
Importing data from other spreadsheets
In a lot of cases, you might have an existing spreadsheet open and need to add additional data from another spreadsheet.
Google Sheets
In Google Sheets, you can use the IMPORTRANGE function. It enables you to specify a range of cells in the other spreadsheet to duplicate in the spreadsheet you are working in.
You must allow access to the spreadsheet containing the data the first time you import the data. The URL shown below is for syntax purposes only. Don't enter it in your own spreadsheet. Replace it with a URL to a spreadsheet you have created so you can control access to it by clicking the Allow access button.

Refer to the Google Help Center's IMPORTRANGE page for more information about the syntax. There is also an example of its use later in the program in Advanced functions for speedy data cleaning.
Microsoft Excel
To import data from another spreadsheet, do the following:
Step 1: Select Data from the main menu.
Step 2: Click Get Data, select From File, and then select From Workbook.
Step 3: Browse for and select the spreadsheet file and then click Import.
Step 4: In the Navigator, select which worksheet to import.
Step 5: Click Load to import all the data in the worksheet; or click Transform Data to open the Power Query Editor to adjust the columns and rows of data you want to import.
Step 6: If you clicked Transform Data, click Close & Load and then select one of the two options:
Close & Load - import the data to a new worksheet
Close & Load to... - import the data to an existing worksheet
Importing data from CSV files
Google Sheets
Step 1: Open the File menu in your spreadsheet and select Import to open the Import file window.

Step 2: Select Upload and then select the CSV file you want to import.

Step 3: From here, you will have a few options. For Import location, you can choose to replace the current spreadsheet, create a new spreadsheet, insert the CSV data as a new sheet, add the data to the current spreadsheet, or replace the data in a specific cell. The data will be inserted as plain text only if you uncheck the Convert text to numbers, dates, and formulas checkbox, which is the default setting. Sometimes a CSV file uses a separator like a semi-colon or even a blank space instead of a comma. For Separator type, you can select Tab or Comma, or select Custom to enter another character that is being used as the separator.

Step 4: Select Import data. The data in the CSV file will be loaded into your sheet, and you can begin using it!
Note: You can also use the IMPORTDATA function in a spreadsheet cell to import data using the URL to a CSV file. Refer to Google Help Center's IMPORTDATA page for more information and the syntax.
Microsoft Excel
Step 1: Open a new or existing spreadsheet
Step 2: Click Data in the main menu and select the From Text/CSV option.
Step 3: Browse for and select the CSV file and then click Import.
Step 4: From here, you will have a few options. You can change the delimiter from a comma to another character such as a semicolon. You can also turn automatic data type detection on or off. And, finally, you can transform your data by clicking Transform Data to open the Power Query Editor.
Step 5: In most cases, accept the default settings in the previous step and click Load to load the data in the CSV file to the spreadsheet. The data in the CSV file will be loaded into the spreadsheet, and you can begin working with the data.
Importing HTML tables from web pages
Importing HTML tables is a very basic method to extract or "scrape" data from public web pages. Web scraping made easy introduces how to do this with Google Sheets or Microsoft Excel.
Google Sheets
In Google Sheets, you can use the IMPORTHTML function. It enables you to import the data from an HTML table (or list) on a web page.

Refer to the Google Help Center's IMPORTHTML page for more information about the syntax. If you are importing a list, replace "table" with "list" in the above example. The number 4 is the index that refers to the order of the tables on a web page. It is like a pointer indicating which table on the page you want to import the data from.
You can try this yourself! In blank worksheets, copy and paste each of the following IMPORTHTML functions into cell A1 and watch what happens. You will actually be importing the data from four different HTML tables in a Wikipedia article: Demographics of India. You can compare your imported data with the tables in the article.
=IMPORTHTML("http://en.wikipedia.org/wiki/Demographics_of_India","table",1)
=IMPORTHTML("http://en.wikipedia.org/wiki/Demographics_of_India","table",2)
=IMPORTHTML("http://en.wikipedia.org/wiki/Demographics_of_India","table",3)
=IMPORTHTML("http://en.wikipedia.org/wiki/Demographics_of_India","table",4)
Exp:
Microsoft Excel
You can import data from web pages using the From Web option:
Step 1: Open a new or existing spreadsheet.
Step 2: Click Data in the main menu and select the From Web option.
Step 3: Enter the URL and click OK.
Step 4: In the Navigator, select which table to import.
Step 5: Click Load to load the data from the table into your spreadsheet.
- Prepare Data for Exploration
- Week 3
- Hands-On Activity: Clean data in spreadsheets with sorting and filtering
- Video: LectureExploring metadata. Duration: 3 minutes3 min
- Reading: Metadata is as important as the data itself. Duration: 10 minutes10 min
- Video: LectureUsing metadata as an analyst. Duration: 3 minutes3 min
- Video: LectureMetadata management. Duration: 3 minutes3 min
- Video: LectureMegan: Fun with metadata. Duration: 2 minutes2 min
- Practice Quiz: Test your knowledge on metadata4 questions
- Video: LectureWorking with more data sources. Duration: 3 minutes3 min
- Reading: From external source to a spreadsheet. Duration: 10 minutes10 min
- Video: LectureImporting data from spreadsheets and databases. Duration: 3 minutes3 min
- Reading: Exploring public datasets. Duration: 10 minutes10 min
- Practice Quiz: Test your knowledge on accessing data sources3 questions
- Video: LectureSorting and filtering. Duration: 5 minutes5 min
- Practice Quiz: Hands-On Activity: Clean data in spreadsheets with sorting and filtering2 questions
- Practice Quiz: Self-Reflection: Considering databases and spreadsheets for sorting and filtering1 question
- Practice Quiz: Test your knowledge on sorting and filtering4 questions
- Video: LectureSetting up BigQuery, including sandbox and billing options. Duration: 3 minutes3 min
- Video: LectureHow to use BigQuery. Duration: 4 minutes4 min
- Video: LectureBigQuery in action. Duration: 6 minutes6 min
- Reading: Using BigQuery. Duration: 10 minutes10 min
- Practice Quiz: Hands-On Activity: Introduction to BigQuery2 questions
- Practice Quiz: Hands-On Activity: Create a custom table in BigQuery2 questions
- Reading: In-depth guide: SQL best practices. Duration: 10 minutes10 min
- Practice Quiz: Hands-On Activity: Applying SQL2 questions
- Practice Quiz: Test your knowledge on using SQL with large datasets4 questions
Hands-On Activity: Clean data in spreadsheets with sorting and filtering
Practice Quiz • 1h
Hands-On Activity: Clean data in spreadsheets with sorting and filtering
1.
Question 1
Activity overview

By now, you have been introduced to cleaning data in spreadsheets, as well as core spreadsheet skills such as sorting and filtering. In this activity, you will use sorting and filtering to clean up a dirty dataset.
Data cleaning corrects or removes incorrect, missing, and faulty data. Cleaning data is of critical importance because an analysis based on dirty data can lead to wrong conclusions and bad decisions. The cleaner your data, the better your results.
For this activity, imagine you are a data analyst working for the superintendent of a large public school district in Portugal. The superintendent wants to know what factors affect student grades in core subjects and what changes can be made to improve student performance. Your team is going to analyze performance data on high school student achievement in two Portuguese public schools, Gabriel Pereira (GP) and Mousinho da Silveira (MS). The data was collected by the school district by means of academic reports and student surveys. The data includes information such as:
Student grades
Student background information
Student study time
Student participation in extracurricular activities
However, before analyzing the data, it’s important to make sure the data is clean. Analyzing bad or dirty data could cause the school district to reach the wrong conclusions and implement ineffective changes. Your assignment is to help clean the data.
By the time you complete this activity, you will be able to sort data in different ways, apply filters to remove incorrect data and fill in missing data, and convert text data to number format. Cleaning data is a critical phase of the data analysis process. Sorting and filtering are useful techniques for data cleaning, and are also key skills that you will draw on throughout your career as a data analyst.

What you will need
To get started, access the spreadsheet that contains the data. Click the link and make a copy of the spreadsheet.
Or, if you don’t have a Google account, you may download the dataset directly from the attachment below:

Clean your data

It’s important to make sure your data is clean so that your eventual analysis will be correct. The first thing to do is check the values in the columns most relevant to your analysis and find out if there is anything for you to clean. In this example, the superintendent’s main objective is to determine what factors drive student performance. To begin answering this question, the columns you want to focus on first are school, age, reason, Medu, Fedu. You can use sorting and filtering to clean the data in each of these columns.
Sorting data

Because you have data from two schools, Gabriel Pereira (GP) and Mousinho da Silveira (MS), you can start by sorting the data by school. Then, you can also sort by age to discover the age ranges of the students for each school. Sorting involves arranging data into a meaningful order to make it easier to understand, analyze, and visualize.
To start, rename your spreadsheet. In the upper left corner, click Untitled Spreadsheet and enter a new name. You can use the name student_performance_data or a similar name that describes the data your spreadsheet contains.
Now, sort by school. Because you want to sort on multiple columns, you need to select all the data in your spreadsheet. Click the blank rectangle above row 1 and to the left of column A. This lets you select all the data on your sheet.

3. Next, from the menu bar, select Data, then Sort range. (Note: For some versions of Google Sheets, the selection Advanced range sorting options may appear on the Data drop-down menu instead of Sort range).
4. In the pop-up window, select Data has header row. Now you can choose specific column headers to sort by.
5. In the Sort by dropdown, choose the header school. Then, click A → Z to sort in ascending order.
6. You also want to sort for age. Before you can sort by age, you need to click Add another sort column to choose a second column header.
7. In the Sort by dropdown, choose the header age. This time, click Z → A to sort in descending order. This way, the oldest students will be listed first.
Your popup window should appear like this:

8. Once both selections have been made, click Sort.
Now, If you scroll through the data, you’ll notice that the age range of the students at Gabriel Pereira (GP) is 15-22 years, and the age range of the students at Mousinho da Silveira (MS) is 15-20 years. It appears that both schools have similar age ranges, but the GP school has students that are a little older.
By sorting the data, you’ve discovered a potential problem with the data. Because this dataset represents high school student achievement, any age older than 18 may indicate that a mistake was made when entering that student's age. You now know what age data may need to be researched and corrected. Your next step is to ask the superintendent about the legitimate age range for students in public high school. Then, you’ll know what age data is incorrect and should be removed.
Removing incorrect data

The superintendent tells you that the maximum age limit for which public education is provided is 19 years old and that the age range should be 15-19 for both schools. Any student outside this age range should be deleted from the dataset.
To clean your data, you need to remove the ages 20, 21, and 22 from your dataset. You can start by applying a filter to the age column. Filtering is the process of showing only the data that meets a specified criteria while hiding the rest. Filtering makes it easier to find data that you need.
First, apply a filter to the age column. Select the age column by clicking the letter at the top of the column (C).
Then, from the menu bar, select Data, then Create a filter.
You can now inspect the values in the age column by going to the top of the column and clicking the Filter icon ().
In Google Sheets, there are nine possible values for the field (15, 16, 17, 18, 19, 20, 21, and 22). You may notice that all the values have check marks. Filter this column for the values you want to select by unchecking all the other values (15, 16, 17, 18, and 19).

5. Then, click OK. This will single out the rows that contain the ages 20, 21, and 22. After you apply the filter, there should be nine such rows (seven for the GP school and two for the MS school).

6. To delete the nine rows, first select them by clicking their row numbers.
7. Then, from the menu bar, select Edit and Delete selected rows.
8. Click the Filter icon at the top of the age column to inspect the values once again. Now that you’ve removed the three incorrect ages (20, 21, and 22), there are five ages remaining (15, 16, 17, 18, and 19). The remaining ages are legitimate and can be used for analysis.
9. Finally, turn off the filter. From the menu bar, select Data and Turn off filter.
Filling in missing data

Filling in missing data is an important part of data cleaning. It’s your job to fill in these blank spaces in your data with accurate values.
The superintendent wants to know what factors influence student performance, and a student’s reason for choosing a specific school will be important to know for analysis. The reason column shows the main reason a student chose to enroll in a specific school, according to their survey response: for example, because of the school’s reputation, or because it offers certain courses, etc. So, you need to make sure the reason column is complete and without blanks.
Start by applying a filter across the entire spreadsheet. Click on any cell in the sheet. Then, from the menu bar, select Data and Create a filter.
All the cells are now highlighted, and there are filters at the top of every column containing data. Click the Filter icon on the reason column (K).
You may notice that the data values in the reason column include blanks. Filter this column for blanks by unchecking all the other values (course, home, reputation).

4. Then, click OK. Now, your sheet shows all the blank rows in the reason column.

5. To clean your data, you need to find a good way to fill in these missing values. In this case, you cannot know what each missing value should be (that is, without a new survey, you can’t discover each student’s reason for choosing a specific school). So, you can replace the missing values with the value none_given. To do this while the column is still filtered for blanks, type none_given in the first empty cell (K2). Then, press Enter.
6. Select cell K2 again. A small blue square, known as the fill handle, appears in the bottom-right corner of the cell. Double click the fill handle to fill all the other blank cells with the value none_given.
7. Finally, turn off the filter. From the menu bar, select Data and Turn off filter. If you scroll down the reason column, you should find that the value none_given has replaced all the blanks in the reason column.
Converting data

During the data analysis process, it's sometimes necessary to change text data (words) to numeric data (numbers). For example, some statistical packages like those used to perform machine learning will only accept numeric data values as input.
In this case, the superintendent wants to know if a parent’s education level is a significant factor in student performance. The relevant data is in the Medu and Fedu columns--which, respectively, refer to the level of education of a student’s mother and father. Currently, the data is in text format. For the purposes of analysis, it will be useful to know the average education level of each student’s parents. To make this calculation, you first need to convert the data in the Medu and Fedu columns to number format.
To do this, you can match specific number values to the text data in each column. Start with the Medu column. If you click on the Filter icon at the top of the Medu column (G), you’ll notice the column contains the text data shown in the table below. You can use the following numeric codes for each piece of text data:

To start, remove the filter from the Medu column.
Next, select the unfiltered Medu column data by clicking its column letter (G).
Then, from the menu bar, select Edit, then Find and replace.
Fill in the popup window for the none value. Next to Find, type none. Next to Replace with, type 0. Check the box next to Match entire cell contents.

5. Then, click Replace all.
6. While still in the popup window, repeat this process (steps 4-5) for the other four educational levels: primary education (4th grade), 5th to 9th grade, secondary education, and higher education.
7. After replacing all five educational levels with numeric values, click Done to close the pop-up window.
8. Check out your spreadsheet. All the cells in the Medu column now display numeric values.
9. Change the text data in the Fedu column (H) in the same way.
Confirmation and reflection

What is the process of showing only the data that meets a specified criteria while hiding the rest?
Filtering is the process of showing only the data that meets a specified criteria while hiding the rest. Filtering is an extremely useful technique for data cleaning, and an essential tool in every data analyst’s toolkit.
2.
Question 2In the text box below, write 2-3 sentences (40-60 words) in response to each of the following questions:
Why is cleaning data such an important part of the data analysis process?
How can sorting and filtering help you clean data more effectively?
Congratulations on completing this hands-on activity! In this activity, you used sorting and filtering to clean “dirty” data in a spreadsheet.
Cleaning data is an important part of the data analysis process. If data analysis is based on bad or dirty data, it may be biased, erroneous, and uninformed. Sorting and filtering are essential skills for every data analyst, and are also very useful for cleaning data. In upcoming activities, you will continue to learn more about the most effective and efficient ways to clean data.
- Get link
- X
- Other Apps
Comments
Post a Comment