Foundations and Techniques of Text Analysis and Natural Language Processing

In-person

The Unit for Data Science and Analytics at Hayden Library, Arizona State University, is thrilled to call for applications for a 3-day workshop on text analysis, designed for researchers and professionals looking to integrate text analysis techniques and text data into their projects. Over three days, participants will be introduced to the essentials of text analysis, starting with a quick overview and a comparison with Natural Language Processing (NLP).

  • Day 1: Basics of Text Analysis
    • Defining text analysis and NLP
    • Primary steps in text analysis
    • Text preprocessing
    • Bag-of-words approach
  • Day 2: Text Analysis with Supervised ML
    • OLS, Logistic Regression, and Regularization
    • Support Vector Machines
    • K-Nearest Neighbors (KNN)
  • Day 3: Text Analysis with Unsupervised and Semi-supervised ML
    • Latent Dirichlet Allocation (LDA)
    • Structural Topic Models (STM)
    • keyATM
    • Word Embeddings
Date:
Wednesday, June 26, 2024
Time:
10:00 am - 11:45 am
Time Zone:
Arizona Time
Location:
Hayden 3rd Floor NW (by the TV wall)
Campus:
Tempe campus
Registration has closed.

On the first day, the workshop will cover the necessary steps involved in text analysis and discuss the application of various statistical software tools. On the second and third days, we will explore applications of machine learning (ML) in text analysis. Day two will cover fundamental algorithms: OLS, logistic regression, support vector machines, and K-Nearest Neighbors (KNN). Participants will learn how to use these methods to derive insights from text data. The final day will explore advanced topic modeling techniques, including Latent Dirichlet Allocation (LDA), Structural Topic Models (STM), and keyATM, as well as word embeddings.

This workshop offers skills that are valuable primarily for academic research, along with techniques that can be applied in the private sector. Participants are welcome and encouraged to bring project ideas or ongoing projects that involve unstructured text data and text methods. In addition to the workshop, we will offer office hours where attendees can have one-on-one discussions with the instructor to get personalized guidance on their projects, corpus creation, text vectorization, and the use of text methods. We will primarily use R for this workshop.

Event Organizer

Namig Abbasov
Kerri Rittschof