ThaiLIS Digital Collection 2019

Nontakan Nuntachit. Classification of COVID-19 medical articles using deep learning model. Master's Degree (Data Science). Chiang Mai University. Library. : Chiang Mai University, 2022.

Title

Classification of COVID-19 medical articles using deep learning model

Title Alternative

การจำแนกเอกสารทางการแพทย์ของโรคโควิด 19 โดยใช้โมเดลการเรียนรู้เชิงลึก

Creator

Name: Nontakan Nuntachit

Subject

LCSH: COVID-19 (Disease)

LCSH: COVID-19 (Disease) -- Research -- Case studies

LCSH: Medicine -- Research

Description

Abstract: The global pandemic of coronavirus disease 2019 (COVID-19) has affected our daily lives. Since 2019, the literature on COVID-19 has grown exponentially, and it is almost impossible for humans to read and classify all of it. In this study, we propose a method for building an unsupervised model, a zero-shot classification model, from a pre-trained BERT (Bidirectional Encoder Representations from Transformers) model. We use the CORD-19 dataset together with the LitCovid database to construct a new vocabulary and prepare the test dataset. For the Natural Language Inference (NLI) downstream task, we use three corpora: Stanford Natural Language Inference (SNLI), Multi-Genre Natural Language Inference (MultiNLI), and MedNLI. This reduces the training time needed to build a task-specific machine learning model by 98.2639%. The final model runs faster and uses fewer resources than the comparators. Its accuracy is 27.84%, which is lower than the best accuracy achieved by 6.73% but still comparable. Finally, we find that a tokenizer and vocabulary more specific to COVID-19 do not outperform general-purpose ones, and that the BART architecture also affects the classification result.

Abstract (translated from Thai): The global pandemic of COVID-19 has affected the daily lives of all of us. Since 2019 (B.E. 2562), medical research on COVID-19 has grown exponentially, and it is almost beyond human capacity to read all of this research and classify it by type. In this study, the researcher proposes a method for building an unsupervised learning model, a zero-shot classification model, from a pre-trained BERT model. The researcher used the CORD-19 dataset (COVID-19 Open Research Dataset) together with the LitCovid database to construct a new vocabulary and prepare the test dataset. For the Natural Language Inference (NLI) downstream task, the researcher used the SNLI (Stanford Natural Language Inference corpus), MultiNLI (Multi-Genre Natural Language Inference corpus), and MedNLI datasets. As a result, the researcher was able to reduce the time needed to build a task-specific machine learning model by 98.2639% compared with the standard method. The final model runs faster and uses fewer resources than the comparators. Its accuracy is 27.84%, which is lower than that of the best model by 6.73%, a difference that is acceptable by comparison. In addition, the researcher found that using a tokenizer and vocabulary specific to COVID-19 did not perform better than using a general vocabulary, and that the BART architecture also affects the classification results.

Publisher

Chiang Mai University. Library

Address: CHIANG MAI

Email: cmulibref@cmu.ac.th

Contributor

Name: Prompong Sugunnasil

Role: Advisor

Date

Created: 2022

Modified: 2023-07-20

Issued: 2023-07-20

Type

วิทยานิพนธ์/Thesis

Format

application/pdf

Language

eng

Thesis

DegreeName: Master of Science

Level: Master's Degree

Discipline: Data Science

Grantor: Chiang Mai University

Rights

RightsAccess:

No.	File name	File size	Access count	Last accessed (date and time)
1	620631118.pdf	2.15 MB	3	2024-08-12 19:26:06

Creator : Nontakan Nuntachit

Title	Contributor	Type
Classification of COVID-19 medical articles using deep learning model มหาวิทยาลัยเชียงใหม่ Nontakan Nuntachit	Prompong Sugunnasil	วิทยานิพนธ์/Thesis

Contributor : Prompong Sugunnasil

Title	Creator	Type and Date Created
Prediction of electricity consumption per capita using interpretable machine learning มหาวิทยาลัยเชียงใหม่ Prompong Sugunnasil	Theera Thongsanitkarn	วิทยานิพนธ์/Thesis
Quality analysis of graduates from Chiang Mai University using machine learning methods มหาวิทยาลัยเชียงใหม่ Nasi Tantitaranukul;Phisanu Chiawkhun;Prompong Sugunnasil;Pruet Boonma	Zihao Zhao	วิทยานิพนธ์/Thesis
Classification of COVID-19 medical articles using deep learning model มหาวิทยาลัยเชียงใหม่ Prompong Sugunnasil	Nontakan Nuntachit	วิทยานิพนธ์/Thesis
Syntactic differences between older adults with and without depressive disorders: a pilot study in Thailand มหาวิทยาลัยเชียงใหม่ Prompong Sugunnasil	Xu, Chengjie	วิทยานิพนธ์/Thesis
Prediction of employment region of graduates using machine learning approach มหาวิทยาลัยเชียงใหม่ Prompong Sugunnasil	Xiaohui, Bao	วิทยานิพนธ์/Thesis
Hallucination detection for large language model in medical context มหาวิทยาลัยเชียงใหม่ Prompong Sugunnasil	Pusit Seephueng	วิทยานิพนธ์/Thesis