Automated Feature Extraction from UML Images to Measure SOA Size
Munialo, S. W.
Muketha, Geoffrey M.
Omieno, K. K.
The fields of text and image extraction and classification have developed enormously. This growth is driven by the large amount of image data generated through document sharing in collaborative software development and the electronic storage of design documents. Deep learning is one of the more recent techniques for analyzing large datasets and discovering underlying patterns. It is a branch of machine learning, inspired by the functioning of the human brain, that is used to analyze unstructured data including images, sound and text. The Unified Modeling Language (UML) is an architectural design notation that gives developers a view of software components and scope. UML diagrams contain text and notations that are mostly analyzed and interpreted manually for the purposes of system implementation and scope or size measurement. Consequently, the processing of electronic design artifacts is time-consuming and prone to bias and errors. Various researchers have attempted to automate the reading and interpretation of design artifacts, but this remains a challenge because of the varying styles in which these artifacts are drawn. This study proposes an automatic tool, based on existing deep learning algorithms, that reads images of UML interface and sequence diagrams in order to measure Service Oriented Architecture (SOA) size. The tool uses a ResNet50 CNN to detect UML arrows, the EAST text detector to detect text, Tesseract OCR with Long Short-Term Memory (LSTM) to recognize text, and a multi-class Support Vector Machine to classify text. We subjected the tool to accuracy tests, which returned encouraging results.
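The four stages named in the abstract can be pictured as a simple chain. The sketch below is illustrative only and is not the authors' implementation: every name in it (`DiagramFeatures`, `extract_features`, `tally`, and the four stage callables) is hypothetical, the stages are passed in as plain functions rather than real ResNet50, EAST, Tesseract or SVM models, and the final tally is a placeholder because the abstract does not state how SOA size is computed from the extracted features.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Tuple

# Hypothetical box format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class DiagramFeatures:
    """Features pulled from one UML diagram image (hypothetical container)."""
    arrows: List[Box]
    texts: List[str]
    labels: List[str]


def extract_features(
    image: Any,
    detect_arrows: Callable[[Any], Iterable[Box]],      # stands in for the ResNet50 CNN
    detect_text_boxes: Callable[[Any], Iterable[Box]],  # stands in for the EAST detector
    recognize_text: Callable[[Any, Box], str],          # stands in for Tesseract OCR (LSTM)
    classify_text: Callable[[str], str],                # stands in for the multi-class SVM
) -> DiagramFeatures:
    """Run the four stages in the order the abstract lists them."""
    arrows = list(detect_arrows(image))
    boxes = list(detect_text_boxes(image))
    # Recognize the text in each detected region and drop empty results.
    raw = [recognize_text(image, box) for box in boxes]
    texts = [t.strip() for t in raw if t.strip()]
    labels = [classify_text(t) for t in texts]
    return DiagramFeatures(arrows=arrows, texts=texts, labels=labels)


def tally(features: DiagramFeatures) -> Counter:
    """Placeholder summary: count items per class plus detected arrows.

    This is an assumption for illustration, not the paper's size metric.
    """
    counts = Counter(features.labels)
    counts["arrow"] += len(features.arrows)
    return counts
```

Passing the stages in as callables keeps the sketch independent of any particular model library; in a real tool each callable would wrap the corresponding trained model.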