A 3D similarity matrix for mobile text misinformation identification

Abstract
Mobile health text messages may be true, fake, or controversial. This research does not judge their correctness by its own standards. Instead, it classifies each message into one of five categories (true, fake, misinformative, disinformative, and neutral) based on previously known messages, and lets users judge the correctness of the messages for themselves according to our recommendations. A 3D similarity matrix is used to classify incoming mobile messages into the five classes by comparing each incoming message to the known messages saved in the database. Consider the following two messages:

The 2D similarity matrix is as follows:

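             cdc  disease  control  prevent  center  covid-19  vaccine  prevent  effect
valid        .    .        .        .        .       .         .        .        x
omicron      .    .        .        .        .       x         .        .        .
booster      .    .        .        .        .       .         x        .        .
develop      .    .        .        .        .       .         .        .        .
pfizer       .    .        .        .        .       .         .        .        .
coronavirus  .    .        .        .        .       x         .        .        .
vaccine      .    .        .        .        .       .         x        .        .
cdc          x    .        .        .        .       .         .        .        .
center       .    .        .        .        x       .         .        .        .
disease      .    x        .        .        .       .         .        .        .
control      .    .        x        .        .       .         .        .        .
prevent      .    .        .        x        .       .         .        x        .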
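
As a minimal sketch of how such a matrix could be filled, the Python snippet below marks only the cells whose row and column keywords are identical. The function name match_matrix and the sparse set-of-pairs representation are assumptions made for illustration, not the authors' implementation. Marks between related but different terms (for example, omicron and covid-19) are not produced by this exact-match sketch.

```python
def match_matrix(row_keywords, col_keywords):
    """Return the sparse 2D matrix as a set of (row, col) index pairs
    whose keywords are identical (hypothetical helper)."""
    return {
        (i, j)
        for i, r in enumerate(row_keywords)
        for j, c in enumerate(col_keywords)
        if r == c
    }


# Keywords taken from the column and row labels of the matrix above.
cols = ["cdc", "disease", "control", "prevent", "center",
        "covid-19", "vaccine", "prevent", "effect"]
rows = ["valid", "omicron", "booster", "develop", "pfizer", "coronavirus",
        "vaccine", "cdc", "center", "disease", "control", "prevent"]

exact = match_matrix(rows, cols)
for i, j in sorted(exact):
    print(rows[i], "matches", cols[j], "at cell", (i, j))
```

Storing only the matching cells keeps the structure sparse, which suits a matrix where most cells are empty.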

Another dimension holds synonyms. For example, Omicron, COVID-19, coronavirus, and Delta lie along this dimension. Once the sparse 3D matrix is built, a similarity value is computed by analyzing the matches, such as

the number of keyword matches,
the number and lengths of phrase matches, and
the number and lengths of common subsequence matches.
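
The three kinds of matches listed above can be sketched in Python as follows. Everything here is an assumption made for illustration: the cell layout (i, j, k) with k = 0 for identical keywords and k = g + 1 for synonym group g, the function names build_3d and score, the use of diagonal runs of length two or more as phrase matches, a single longest common subsequence in place of all common subsequences, and the unweighted sum used as the total. The text does not specify how the counts are combined.

```python
def build_3d(a, b, synonym_groups):
    """Sparse 3D matrix as a set of (i, j, k) cells (assumed layout):
    k = 0 for identical keywords, k = g + 1 for synonym group g."""
    cells = set()
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if x == y:
                cells.add((i, j, 0))
                continue
            for g, group in enumerate(synonym_groups):
                if x in group and y in group:
                    cells.add((i, j, g + 1))
    return cells


def score(a, b, synonym_groups):
    """Combine keyword, phrase, and subsequence matches (unit weights assumed)."""
    # Project the 3D cells onto the 2D plane of keyword positions.
    hits = {(i, j) for i, j, _ in build_3d(a, b, synonym_groups)}
    # Keyword matches: keywords of `a` that match something in `b`.
    keyword = len({i for i, _ in hits})
    # Phrase matches: maximal diagonal runs of at least two cells.
    runs = []
    for i, j in sorted(hits):
        if (i - 1, j - 1) not in hits:  # only start at the head of a run
            n = 1
            while (i + n, j + n) in hits:
                n += 1
            if n >= 2:
                runs.append(n)
    # Longest common subsequence by dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a)):
        for j in range(len(b)):
            if (i, j) in hits:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[-1][-1]
    return {"keyword": keyword, "phrase_runs": runs, "lcs": lcs,
            "total": keyword + sum(runs) + lcs}


# The synonym group comes from the example in the text.
groups = [{"omicron", "covid-19", "coronavirus", "delta"}]
cols = ["cdc", "disease", "control", "prevent", "center",
        "covid-19", "vaccine", "prevent", "effect"]
rows = ["valid", "omicron", "booster", "develop", "pfizer", "coronavirus",
        "vaccine", "cdc", "center", "disease", "control", "prevent"]
result = score(rows, cols, groups)
print(result)
```

On these keywords the sketch finds two phrase runs, "coronavirus vaccine" against "covid-19 vaccine" and "disease control prevent", and a longest common subsequence of "cdc disease control prevent".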

The incoming message is assigned the category of the saved message with which it has the highest similarity value.
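
This selection step amounts to taking a maximum over the saved messages, as in the Python sketch below. The similarity function is a deliberately simple stand-in (the count of shared keywords) rather than the matrix-based value, and the saved messages and their category labels are made-up examples built from the keywords above, not data from this research.

```python
def similarity(a, b):
    """Stand-in similarity: number of shared keywords (placeholder only)."""
    return len(set(a) & set(b))


def classify(incoming, saved):
    """Return (category, value) for the saved message most similar to
    the incoming one; ties go to the earliest saved message."""
    best_keywords, best_category = max(
        saved, key=lambda entry: similarity(incoming, entry[0])
    )
    return best_category, similarity(incoming, best_keywords)


# Hypothetical database of (keywords, category) pairs; labels are arbitrary.
saved = [
    (["cdc", "center", "disease", "control", "prevent", "covid-19", "vaccine"], "true"),
    (["omicron", "booster", "pfizer", "develop"], "neutral"),
]
incoming = ["cdc", "vaccine", "prevent", "effect"]
print(classify(incoming, saved))
```

Any similarity function that returns a comparable number could be passed through the same selection logic.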

Keywords
mobile computing, security, privacy, text mining, data mining, misinformation, misinformation identification, similarity measurement, 3D similarity matrix, sentence similarity, mobile data management

Conference