Course Code: AIB525

Synopsis

AIB525 Advanced Topics in Natural Language Processing introduces novel and effective machine learning techniques for understanding and analysing human language in business applications. The course explains the key approaches to acquiring and handling natural-language text using Python. Students will learn to pre-process and transform unstructured and semi-structured textual data into an analysable format. They will then learn advanced data science solutions created for Natural Language Processing (NLP), including word, sentence and document embeddings, word clouds, Named Entity Recognition (NER), sentiment analysis, topic modelling, and influencer/topic network analysis, enabled by supervised, unsupervised and deep machine learning techniques. Real-world examples, hands-on class exercises and assignments are designed to help students prepare and analyse data with Python’s libraries, and to use NLP techniques to extract new knowledge that can be integrated into data analysis workflows to improve business performance.
Presentation Pattern: EVERY JAN

Topics

  • Basics of Natural Language Processing (NLP)
  • Build NLP Vocabulary Using Python
  • Bag-of-Words (BoW) and TF-IDF Vectors
  • Pre-trained Word2vec Model
  • Venture into Doc2vec, Sent2Vec and Universal Sentence Encoder
  • Word Cloud Analysis
  • Standard and Customised Sentiment Analyser
  • Entity Recognition Techniques
  • Evaluation Metrics for Classifiers
  • Topic Modelling Techniques
  • Topic and Influencer Network
  • State of the Art in NLP - Artificial Neural Networks (ANNs), Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs)

Learning Outcomes

  • Develop know-how in applying AI tools and technologies to areas of NLP
  • Assess the possibilities and implications of NLP in different industries
  • Formulate NLP analytics strategies
  • Prepare unstructured/semi-structured data in an analysable format
  • Design NLP solutions using supervised, unsupervised and deep machine learning techniques
  • Revise data analysis workflows by integrating NLP solutions to improve business performance