LMS Toolkit User Guide - Extract Data From the LMS
The LMS Toolkit provides utilities called "extractors" that fetch data from your Learning Management System (LMS).
These extractors may be used as part of a larger "pipeline" that moved data from your LMS to your ODS. For how to information on that second step, see LMS Toolkit User Guide - Load Data into the ODS
Step 1: Create an Extractor Configuration File
The first step in the pipeline is the extractor. Extractors connect to the API (or other applicable interface) of the LMS and extract the data.
To run an extractor, you need to provide the extractor the information needed to connect to your LMS. The simplest way to do this is to put this data into a file named ".env" that sits in the same directory as the extractor. The LMS Toolkit gives you a sample file named ".env.example" to work with. You can simply rename this file to get started. You can find those files at this location (use the link that corresponds to your LMS platform; [INSTALL DIRECTORY] is where the LMS Toolkit is installed):
Canvas
[INSTALL DIRECTORY]/LMS-Toolkit/src/canvas-extractor/.env.example
Google Classroom
[INSTALL DIRECTORY]/LMS-Toolkit/src/google-classroom-extractor/.env.example
Schoology
[INSTALL DIRECTORY]/LMS-Toolkit/src/schoology-extractor/.env.example
Make a copy of this file and name it .env. Leave it in the same directory as the ".env.example" file.
The file for Canvas LMS would look like this:
CANVAS_BASE_URL=[CANVAS_BASE_URL]
CANVAS_ACCESS_TOKEN=[CANVAS_ACCESS_TOKEN]
START_DATE=[CLASS_START_DATE]
END_DATE=[CLASS_END_DATE]
OUTPUT_DIRECTORY=data
SYNC_DATABASE_DIRECTORY=data
FEATURE=[activities, attendance, assignments, grades]
# options: DEBUG, INFO, WARNING, ERROR, CRITICAL
LOG_LEVEL=INFO
Each .env file has 2 kinds of information in it:
- Configuration info that is SPECIFIC to the LMS you are trying to connect to.
- Configuration info that is common to ALL LMS toolkit extractors
In the sections below we cover how to complete both of these sets of configuration options.
Step 2: Add LMS-specific Configuration To Your Config File
In the config file, you now need to fill out the elements specific to your LMS.
Consult the section below to find the configuration options specific to your LMS
Canvas
Canvas requires these four connection configuration parameters:
Parameter name | What to put there |
---|---|
CANVAS_BASE_URL | This points to the default URL for the entire canvas deployment, which might be something like "canvas.myschooldistrict.edu" or "myschooldistrict.edy/canvas/2021" etc. |
CANVAS_ACCESS_TOKEN | This is the Canvas API access token that you retrieved from the Canvas administration panel. |
START_DATE | Start Date and End Date are the dates between which you want to pull data assignment, activity and other data, We recommend by default that these dates span a semester or similar school calendar timespan. |
END_DATE |
Google Classroom
Google Classroom requires this one connection configuration parameters:
Parameter name | What to put there |
---|---|
CLASSROOM_ACCOUNT | This is the email address of the Google Classroom admin account. |
Schoology
Schoology requires these two connection configuration parameters:
Parameter name | What to put there |
---|---|
SCHOOLOGY_KEY | This is the Schoology API key that you retrieved from the Schoology administration panel. |
SCHOOLOGY_SECRET | This is the Schoology API secret value that you retrieved from the Schoology administration panel. |
Now you should have a .env config file that is partially filled out.
Step 3: Test the Connection to the LMS
At this point, you should have a .env file has the info it needs to connect to the LMS.
Next remove all the lines from the .env file that you did not edit above (lines that are not-LMS specific). After doing that, you should have a file that looks like this (click on the link corresponding to your LMS):
Canvas
CANVAS_BASE_URL=https://myschooldistrict/canvas
CANVAS_ACCESS_TOKEN=a1866c3d1e9a47c893d84d97d6584a0f
START_DATE=08/10/2021
END_DATE=12/17/2021
Google Classroom
CLASSROOM_ACCOUNT=google-admin@mydistrict.edu
Schoology
SCHOOLOGY_KEY=RvcohKz9zHI4
SCHOOLOGY_SECRET=RvcohKz9zHI4
Next, open a command line tool (Windows command, PowerShell, etc.) and navigate to the root director for the extractor. Install the tool's dependencies before you execute the tool for the first time.
Canvas
cd [INSTALL DIRECTORY]/LMS-Toolkit/src/canvas-extractor/
# One-time execution:
poetry install
Google Classroom
cd [INSTALL DIRECTORY]/LMS-Toolkit/src/google-classroom-extractor/
# One-time execution:
poetry install
Schoology
cd [INSTALL DIRECTORY]/LMS-Toolkit/src/schoology-extractor/
# One-time execution:
poetry install
Finally, run the command that corresponds to your extractor
Canvas
poetry run python edfi_canvas_extractor
Google Classroom
poetry run python edfi_google_classroom_extractor
Schoology
poetry run python edfi_schoology_extractor
You should see data flowing and lines that look like the following appear.
2021-09-03 17:45:41,185 - INFO - edfi_canvas_extractor.extract_facade - Starting Ed-Fi LMS Canvas Extractor
2021-09-03 17:45:41,213 - INFO - edfi_canvas_extractor.extract_facade - Extracting Courses from Canvas API
2021-09-03 17:45:41,213 - INFO - edfi_canvas_extractor.api.courses - Pulling course data
2021-09-03 17:45:43,925 - INFO - edfi_canvas_extractor.extract_facade - Extracting Sections from Canvas API
2021-09-03 17:45:43,925 - INFO - edfi_canvas_extractor.api.sections - Pulling section data
2021-09-03 17:46:08,866 - INFO - edfi_canvas_extractor.extract_facade - Writing LMS UDM Sections to CSV file
2021-09-03 17:46:08,870 - INFO - edfi_lms_extractor_lib.csv_generation.write - Generated file => c:\source\ed-fi\lms-data\canvas\sections\2021-09-03-17-46-08.csv
Congratulations: you are on your way to using LMS data!
It may take a bit of time to complete this run, as you are now extracting basic data from your LMS on users and sections. if your school district is large, that might be tens of thousands of records, and will take some time to complete.
In the next steps, we expand the config file to tell the extractor what LMS data we need, and decide if we need to do other optional configuration - keep reading to learn more.
If there are errors
Read the error messages and copy them down in case you need to share them with others. The most likely causes of errors are:
- The configuration information is incorrectly entered. Double check the info in the .env file to be sure it was copied correctly
- There is some network configuration that is blocking access from the machine you are on to the LMS. To resolve this item, you will need to consult with a network person at your organization.
Step 4: Add General Configuration to your Config File
Now that the extractor is working, you need to tell it what data you need, and decide if you need to do other optional configuration.
The most important configuration is to tell the extractor what data you need. Data is organized into domains or "features". To set this up, add a line to your .env file that looks like this:
FEATURE=assignments
This line tells the extractor that you want all data related to assignments.
These are the features available. Please note that some are under development/experimental (see column 3) and that the Toolkit only extracts some of the LMS data in this area.
Feature | Experimental? | Description |
---|---|---|
assignments | No | Extracts data related to student assignments and student assignment submissions. |
activities | No | Extracts data related to student activities - logins/logouts (if available), discussion forum posts, etc. |
grades | YES | If the LMS has a gradebook integrated, extracts grades |
attendance | YES | If the LMS has an attendance system integrated, extracts attendance data. Note: this is not a value calculated by he LMS Toolkit and does not attempt to derive if the student was in attendance or not! This is literally only what is in the LMS attendance system |
To request multiple features, you can separate them with commas
FEATURE=[assignments, activities]
You may have questions about what exact data is included. That information is LMS-specific. If you need that info, you can consult: LMS Unifying Data Model#AdditionalMappingNotes
At this point, we recommend that you these two lines to your .env file - this tells the extractor to gather assignments data.
FEATURE=[assignments]
LOG_LEVEL=warn
This second line just tells the extractor to reduce the amount of text it spews out, if effect saying: "just tell me when things go wrong."
You should now have a file that looks like this (click on the link corresponding to your LMS).
Canvas
CANVAS_BASE_URL=https://myschooldistrict/canvas
CANVAS_ACCESS_TOKEN=a1866c3d1e9a47c893d84d97d6584a0f
START_DATE=08/10/2021
END_DATE=12/17/2021
FEATURE=[assignments]
LOG_LEVEL=warn
Google Classroom
CLASSROOM_ACCOUNT=google-admin@mydistrict.edu
FEATURE=[assignments]
LOG_LEVEL=warn
Schoology
SCHOOLOGY_KEY=RvcohKz9zHI4
SCHOOLOGY_SECRET=RvcohKz9zHI4
FEATURE=[assignments]
LOG_LEVEL=warn
Now run the extractor using the same command you used above. You should see data flowing again!
It may take some time to complete this run, as you are now extracting data on ALL assignments and ALL assignment submissions. If your school district is quite large, this could take a while.
Step 5: Verify Your Data
The fist step to verifying that the extractor got the data is to look for errors during the extractor run. Look and see if any lines that start with "Warning" or "Error" appear. There.
A further inspection is to look in the filesystem for the files that the extractor created. By default, the data is extracted to a directory named "data" in the same directory where the extractor lives. If you open up that directory, you should see something that looks like this:
The presence of a directory with a name denotes that the extractor is attempting to gather that class of data. For example, in the screenshot above you can see many "assignment=*" folders under a "section=*" folder: this is the extractor gathering data on individual assignments for a section.
If extracting data from the LMS at a large school district, your filesystem should contain tens of thousands of such files.
Step 6: Set any Additional Configuration Options (optional)
Below are listed all the possible configuration options and which extractor each applies to. As you prepare the extractor for regular usage, you may choose to tune some of these options.
When running with the command line tool, you can also provide the --help
option to get the full set of options for that extractor.
Applies To | Argument | Required? | Purpose |
---|---|---|---|
All | Feature | No | Define which optional features are to be retrieved from the upstream system. Default: none. Available features:_Activities : encompassing section activities and_system activities. Attendance : attendance data. Only applies to Schoology.**Assignments** : encompassing assignments and _submissions. **Grades** : section-level grades (assignment grades are included on the submissions resource). Experimental and only implemented for Canvas at this time. Note: Sections, Section Associations, and Users are always pulled from the Source System. |
Log Level | No | Valid options are: DEBUG, INFO (default), WARNING, ERROR, CRITICAL | |
Output Directory | No | The output directory for the generated CSV files. Defaults to: ./data . | |
Sync database directory | No | Directory for storing a SQLite database that is used in support of synchronizing the data between successive executions of the tool. Defaults to: ./data . | |
Google Classroom | Classroom account | Yes | The email address of the Google Classroom admin account. |
Usage start date | No | Start date for usage data pull in YYYY-MM-DD format. | |
Usage end date | No | End date for usage data pull in YYYY-MM-DD format. | |
Schoology | Client key | Yes | Schoology client key. |
Client secret | Yes | Schoology client secret. | |
Page size | No | Page size for the paginated requests. Defaults to: 200. Max value: 200. | |
Input directory | No | Input directory for usage CSV files. | |
Canvas | Base URL | Yes | The Canvas API base url. |
Access token | Yes | The Canvas API access token | |
Start Date | Yes | Start date for the range of classes and events to include, in YYYY-MM-DD format. | |
End Date | Yes | End date for the range of classes and events to include, in YYYY-MM-DD format. |
Are you a software developer familiar with Python? If you are, you can likely skim (or skip) much of this material and go directly to the README files in the GitHub repo! See https://github.com/Ed-Fi-Alliance-OSS/LMS-Toolkit for the best starting point!