THURS-108 - Working with Text-Based Data Sources: Essentials
Thursday, April 17, 2025
5:30 PM – 6:30 PM PST
Location: Pacific I/II, 2nd Floor
Area of Responsibility: Area I: Assessment of Needs and Capacity
Subcompetencies: 4.3.2 Implement data collection procedures; 4.2.4 Assess capacity to conduct research
Lead Data Consultant, Superga Integrated Data Consulting, Jamaica Estates, New York, United States
Learning Objectives:
At the end of this session, participants will be able to:
Describe the essential elements of a data ecosystem that uses text as data
Explain the methodological approaches, considerations and limitations of text analysis
Conduct a preliminary assessment of their own data systems for text analysis readiness
Brief Abstract Summary: Learn essential skills for working with free-text data sources, such as those from social media and websites. This session will share results from an analysis of publicly available requests for proposals for natural language processing (NLP) digital transformation projects. Participants will learn what is needed to work with free-text data feeds and to process those feeds into actionable data. This hands-on session will teach participants what they need to know and do to start working with free-text data sources right away.
Detailed abstract description: Text-based forms of data represent an innovative data source that can powerfully advance the mission of public health and health service organizations. Many organizations want to use free text from a variety of sources, social media in particular, to further their work in communities. However, organizations can only benefit from these data sources if they can integrate free-form text data into their data ecosystems. We analyzed publicly available requests for proposals for NLP digital transformation projects to understand what organizations need in order to be prepared to work with text data. The most frequently requested capabilities were: pipelines for continuous feeds of text data; the ability to tokenize and store text data in structured formats; topic set-up, maintenance and analysis; the ability to conduct sentiment analysis; and the ability to connect to APIs. We will describe and clarify each of these components of text analysis. Participants will learn essential NLP skills for free-text data, such as social media feeds. They will gain an understanding of what is needed to transform a typical existing tech stack to allow for continuous analysis of text data.
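As an illustration of one component named above, tokenizing free text and storing it in a structured format, the following is a minimal sketch and not the session's own method. It assumes a simple regular-expression tokenizer and an in-memory SQLite table; the table name, column names, function names, and sample posts are all hypothetical.

```python
import re
import sqlite3

# Assumption: a token is a run of lowercase letters, digits, or apostrophes.
TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def store_posts(conn, posts):
    """Write one row per token so the text can be queried like any other table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tokens "
        "(post_id INTEGER, position INTEGER, token TEXT)"
    )
    for post_id, text in posts:
        rows = [(post_id, i, tok) for i, tok in enumerate(tokenize(text))]
        conn.executemany("INSERT INTO tokens VALUES (?, ?, ?)", rows)
    conn.commit()


def top_terms(conn, n=3):
    """Return the n most frequent tokens, breaking ties alphabetically."""
    cur = conn.execute(
        "SELECT token, COUNT(*) AS freq FROM tokens "
        "GROUP BY token ORDER BY freq DESC, token ASC LIMIT ?",
        (n,),
    )
    return cur.fetchall()


# Hypothetical sample posts standing in for a social media feed.
posts = [
    (1, "Flu shots are available at the clinic today"),
    (2, "The clinic ran out of flu shots"),
]
conn = sqlite3.connect(":memory:")
store_posts(conn, posts)
print(top_terms(conn))
```

Storing one token per row keeps the text queryable with ordinary SQL, which is one way a continuous feed could be connected to an existing reporting stack. A production pipeline would likely also need stop-word handling, deduplication, and a persistent database.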