the users to validate the top 10 skills per resume
ranked by relevancy scores. To measure recall, we ask
the users to add up to five skills that were missing
from the presented list.
The results show that the current skill tagging
framework attains 90 percent precision and 73 percent recall, which is better than the 82 percent precision and 70 percent recall of the previous version
(Zhao et al. 2015), as noted in Table 4. Moreover, a
strong correlation between the relevancy score and
the user approval rate is observed. Table 5 shows that
the higher the relevancy score, the higher the chance
of approval by users.
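The evaluation protocol above can be illustrated with a minimal sketch. The source does not give the exact formulas, so the following is an assumption: precision is taken as the share of presented skills that users approved, and recall as approved skills divided by approved skills plus the skills users added as missing. The `ResumeFeedback` record and both function names are hypothetical and used only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ResumeFeedback:
    """Hypothetical record of one user's review of one tagged resume."""
    presented: list  # (relevancy_score, approved) pairs, at most 10 per resume
    missing: list = field(default_factory=list)  # user-added skills, at most 5


def precision_recall(feedback):
    """Aggregate precision and recall over all reviewed resumes."""
    approved = sum(1 for fb in feedback for _, ok in fb.presented if ok)
    shown = sum(len(fb.presented) for fb in feedback)
    added = sum(len(fb.missing) for fb in feedback)
    precision = approved / shown if shown else 0.0
    # Assumption: the relevant set is the approved skills plus the
    # skills the user reported as missing from the presented list.
    relevant = approved + added
    recall = approved / relevant if relevant else 0.0
    return precision, recall


def approval_rate_by_decile(feedback):
    """Approval rate per relevancy-score decile (9 covers 0.90 to 1.00)."""
    counts = defaultdict(lambda: [0, 0])  # decile -> [approved, shown]
    for fb in feedback:
        for score, ok in fb.presented:
            # Work in integer hundredths to avoid float bucketing errors.
            decile = min(int(round(score * 100)) // 10, 9)
            counts[decile][0] += int(ok)
            counts[decile][1] += 1
    return {d: a / n for d, (a, n) in sorted(counts.items())}
```

Grouping approvals by score decile is one simple way to check the kind of relationship reported in Table 5; the bucket width here is an arbitrary choice, not the one used in the study.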
Technical Design Overview
Our projects require a successful synergy between
data scientists and data engineers to move from prototype to production. The SKILL system is one of the
first projects at CB that demonstrated successful collaboration between the two organizational teams.
Projects with a strong data science component begin with the data scientists conducting R&D
spikes to build prototypes that verify the feasibility of
business ideas. The data engineers are involved in
technical design discussions as soon as it becomes
evident that the prototype can move to production.
The synergy between the two teams is critical because
it is important to understand the limitations of open-source tools used by data scientists in production
environments. These limitations sometimes influence the tools used by the data scientists, but in general, we do not place hard restrictions during the prototyping phase.
For the SKILL system, the goal was to provide a
skill tagging service (The SKILL service) as a microser-
Table 3. List of Tagged Skills of the Resume Sample Presented in Table 2.
Raw Term                  Normalized Term                     Relevancy Score  Type
software development      Software Development                0.95             Hard Skill
machine learning          Machine Learning                    0.95             Hard Skill
automated testing         Test Automation                     0.93             Hard Skill
Android                   Android (Operating System)          0.92             Hard Skill
java                      Java (Programming Language)         0.90             Hard Skill
python                    Python (Programming Language)       0.88             Hard Skill
mobile app                Mobile App                          0.87             Hard Skill
unix                      Unix                                0.85             Hard Skill
api                       Application Programming Interface   0.85             Hard Skill
biomolecular engineering  Biomolecular Engineering            0.84             Hard Skill
chemical engineering      Chemical Engineering                0.84             Hard Skill
mandarin chinese          Mandarin Chinese Language           0.82             Hard Skill
Figure 2. Sample Documents Representing the Two Senses of the NLP Skill Term.
Each colored box represents a set of related skill entities extracted from resumes or job postings.
Table 4. Skill Tagging Comparison Between Versions: Precision and Recall.
Version  Precision  Recall
Old      82%        70%
Current  90%        73%