Session
FOSDEM 2021 Schedule
Python

Inventing Curriculum using Python and spaCy

Are you an educator who wants to design and teach an industry-aligned curriculum? Then you have come to the right place. In this talk, we will show how to design a better curriculum using natural language processing libraries in Python, namely spaCy and textacy.
The curriculum in general, and the undergraduate curriculum in particular, is one of the most important pillars of an education system. The undergraduate curriculum has two main objectives: employability and preparation for higher education. The greatest challenge in designing an undergraduate curriculum is achieving a balance between employability skills and laying the foundation for higher education. Generally, the curriculum is a combination of core technical subjects, professional electives, humanities, and skill-oriented subjects. We used natural language processing and machine learning packages in Python to build a curriculum design system. The steps to build a curriculum design system are described below:
1. The dataset was built from job profiles on job listing websites such as stackoverflow.com, indeed.com, linkedin.com, and monster.com, and from the syllabi of competitive exams and qualifying exams for higher education.
2. On the dataset, we applied natural language processing techniques to identify the subjects and subject content. For natural language processing, we used spaCy, an industrial-strength natural language processing package in Python.
3. To generate syllabus content for a particular subject, a pointer-generator network was used. The pointer-generator network is a text summarization technique that combines extractive and abstractive summarization. Extractive summarization extracts keywords from the dataset, whereas abstractive summarization generates new text from the existing text. The pointer-generator network was implemented using the scikit-learn machine learning package in Python.
4. The generated curriculum was then compared with the existing curriculum to get insights such as what percentage of the curriculum is industry-oriented and what percentage is aimed at higher education and job-oriented skills. At this step, we used the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metric to compare the generated curriculum against the reference/proposed curriculum.
5. The above steps can be repeated with modified parameters to get better insights and a better curriculum.
This also gives us an idea of how we can have an evolving curriculum that can help us bridge the gap between industry and academia.
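To make the comparison in step 4 concrete, here is a minimal sketch of ROUGE-N recall in plain Python. It is not the speakers' implementation: the function names (`ngrams`, `rouge_n_recall`), the whitespace tokenization, and the two syllabus snippets are assumptions made up for illustration, and a real evaluation would more likely use an established ROUGE package.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(reference, candidate, n=1):
    """ROUGE-N recall: matched n-grams divided by n-grams in the reference."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    ref_counts = ngrams(reference.lower().split(), n)
    cand_counts = ngrams(candidate.lower().split(), n)
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    # Clipped overlap: a reference n-gram is matched at most as many
    # times as it occurs in the candidate.
    overlap = sum(min(count, cand_counts[gram]) for gram, count in ref_counts.items())
    return overlap / total


# Hypothetical syllabus snippets, invented for illustration only.
existing = "data structures algorithms operating systems databases"
generated = "data structures algorithms machine learning databases cloud computing"

# 4 of the 6 reference unigrams appear in the generated text.
print(f"ROUGE-1 recall: {rouge_n_recall(existing, generated, n=1):.2f}")
# 2 of the 5 reference bigrams appear in the generated text.
print(f"ROUGE-2 recall: {rouge_n_recall(existing, generated, n=2):.2f}")
```

Here the existing curriculum plays the role of the reference and the generated curriculum the candidate, so the score reads as "how much of the existing curriculum is covered by the generated one". Swapping the arguments would answer the opposite question.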

Additional information

Type devroom

More sessions

2/7/21
Python
D.python
We will talk about different approaches to teaching Data Science with the Python programming language. As a case study, we will use our own experience in providing Data Science education with Python across different audiences in the Asia Pacific region and share approaches and principles that worked for us. The lecture will serve as an anchor for more conversations and discussions on adapting pedagogy to be most effective in various contexts and settings.
2/7/21
Python
D.python
In 2020, with funding from Mozilla and CZI, the pip packaging team improved pip for all users. The work focused on improving dependency resolution for Python packages and the user experience for all Python users. We carried out usability testing and user research, and improved error and information messages for pip. This talk will be about these improvements - we'll explain how the new dependency resolver works, what it can (and can't!) do, how we improved the user experience of pip, the ...
2/7/21
Python
D.python
Mypy has been around since 2012, and in recent years it has been gaining widespread adoption. As the framework continues to evolve and improve, more and more useful features are being added. In this talk I will present some hidden gems in the type system you can use to make your code better and safer!
2/7/21
Python
D.python
We made the Web accessible to humans. What about making the web (of data) accessible to computers? Publishing open data can be a struggle. Depositing a CSV file on a web server is not enough. The data model used should be explicitly defined. Linked Open Data (https://www.w3.org/standards/semanticweb/data) solves this with: * a standardized implementation format (RDF) * standardized data access (content negotiation, SPARQL endpoint) * standardized data identification (URI, data model as data) For ...
2/7/21
Python
D.python
Everybody hates mundane tasks: they are boring, repetitive, and time-consuming. That’s why I love building bots: they can finish my tasks for me by working 24/7. But to build a bot that interacts with users, you have to write async code. If you are afraid of async, don’t worry! Today I will tell you how I learned to use async and how I avoided checking in 500+ people at a conference by building a bot with Discord.py.
2/7/21
Python
D.python
As part of the schul-frei project of Teckids e.V., we curate free software and offer it to educational institutions. Besides generally equipping schools with free software, the equal involvement of students in development is important to us. One of the solutions presented by the schul-frei project is AlekSIS, a web-based school information system that is being developed jointly by Teckids e.V. and students of the Katharineum in Lübeck. The Django-based platform provides data structures ...
2/7/21
Python
D.python
Pinax is an open-source ecosystem of reusable Django starter projects, apps, and themes for building websites. When developers began building Pinax in 2007, they had fun adding to it, but eventually Pinax grew to around 80 projects and apps. Without a strategy in place to make Pinax as easy as possible to maintain, the maintainers began to suffer burnout. I was hired to work on Pinax in the fall of 2017. In my talk, I'll outline the critical problems I've discovered and the solutions ...