Denouncing the Plagiarism of "Automatically Categorizing Software Technologies"

Initially published in IEEE Transactions on Software Engineering, 2018

Annotated view of Khan and Butt's ICoDT2 article with paragraphs copied from Nassif et al.'s TSE article highlighted in orange (paraphrased) and red (verbatim).

Our IEEE TSE article, "Automatically Categorizing Software Technologies", was plagiarised, using tortured phrases, by Khan and Butt's article published at the IEEE 2nd International Conference on Digital Futures and Transformative Technologies (ICoDT2).

The image shows an annotated version of their article, with content reused without attribution highlighted in red (verbatim copy) and orange (paraphrased, often with tortured phrases). Below are some of the most entertaining examples of "paraphrasing" with tortured phrases.

| Original phrase | "Paraphrased" phrase |
| --- | --- |
| statistically significant | factually momentous |
| hierarchy | pecking order |
| programming languages | programming lingoes |
| code elements | code rudiments |
| hyponyms / hypernyms | lower words / upper words |
| implicit and explicit mechanisms | implied and obvious devices |
| java [as a category name in a related work] | java.lang.org |
| the returned category | the pay back category |
| hypernym discovery problem | hypernym unearthing problem |
| [DBpedia is a] crowd-sourced database | crowdfunding database |
| overcome this lack of context | get the better of this want of background |
| agglomerative hierarchical clustering framework | agglomerated tiered grouping outline |
| environment concerns [as one type of information that Stack Overflow tags can provide] | environmental issues |
| very low coverage | incredibly near to the ground coverage |
| hypernymy and hyponymy relations | top and bottom ratios |
| What Is This Technology [the expansion of the Witt acronym] | What is Technology |
| the taxonomy of Wikipedia categories, and vice-versa | the nomenclature of Wikpipedia groups, and contrarywise |
| consistently return a lower number of false positives | unswervingly reimbursed an inferior figure of false positives |
| a large improvement of the false positive rate [i.e., a decrease] | a significant increase in the false alarm rate |
| make it difficult to reliably analyze technology trends on discussion forums and other on-line venues | prepare it hard to dependably evaluate automation courses on debate mediums as well as alternative networking platforms |
| looked at the tools described in the research articles and those used in the evaluation sections | seen at the gears defined in the study and those castoff in the calculation segments |
| demonstrated better coverage than all evaluated alternative solutions, without a degradation in false positive rate | established improved exposure compared to all gauged substitute methods, without conforming to dilapidation to false-positive rate |
| For a larger tag vocabulary on software project hosting websites such as Freecode, Wang et al. proposed a similarity metric to infer semantically related tags and build a taxonomy [40]. | Used for software projects such as free codes to offer a larger label vocabulary on websites, Wang et al. An equality measure has been proposed to derive lexically correlated labels in addition establish a nomenclature [24]. |