Skill requirements in job advertisements: A comparison of skill-categorization methods based on wage regressions
In this paper, we compare different methods for extracting skill demand from the text of job descriptions. We propose the fraction of wage variation explained by the extracted skills as a novel performance metric for comparing methods. Using this metric, we compare the performance of the word-counting method with three different dictionaries against that of three unsupervised topic-modeling techniques: LDA, PLSA, and BERTopic. We apply these methods to a U.K. job board dataset of 1,158,926 job advertisements from 35 industries, collected in 2018. We find that each of the dictionary-based methods explains about 20% of the wage variation across jobs. The topic-modeling techniques perform better: PLSA explains 36.5% of the wage variation and BERTopic explains 32.6%. The best-performing method is LDA, which explains 48.3% of the wage variation. Its disadvantage, however, is that the extracted skills are difficult to interpret.
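The proposed metric can be illustrated with a minimal sketch. It assumes the "fraction of wage variation explained" is the R² of an ordinary least squares regression of wages on the extracted skill features, which the abstract does not confirm in detail. The advertisements, wages, skill dictionary, and the use of log wages below are all hypothetical and are not taken from the paper's data or dictionaries.

```python
import numpy as np

# Hypothetical toy data: ad texts and log wages, made up for illustration only.
ADS = [
    "python developer with sql experience",
    "sales assistant with customer service skills",
    "data analyst sql excel",
    "customer service advisor",
    "senior python engineer",
    "sales manager with excel reporting",
]
LOG_WAGES = np.array([3.4, 2.3, 3.0, 2.2, 3.6, 2.8])

# Hypothetical skill dictionary (skill name -> keywords); not one of the paper's three.
SKILL_DICT = {
    "programming": ["python", "sql"],
    "office": ["excel"],
    "customer": ["customer", "sales"],
}


def skill_counts(ads, skill_dict):
    """Word-counting method: count keyword occurrences per skill for every ad."""
    rows = []
    for text in ads:
        tokens = text.lower().split()
        rows.append([sum(tokens.count(k) for k in kws) for kws in skill_dict.values()])
    return np.array(rows, dtype=float)


def r_squared(X, y):
    """Share of the variance in y explained by an OLS fit on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


X = skill_counts(ADS, SKILL_DICT)
score = r_squared(X, LOG_WAGES)
print(f"Fraction of wage variation explained (R^2): {score:.3f}")
```

Under the same assumption, a topic-modeling method could be scored by replacing the columns of `X` with per-advertisement topic weights, so that every method is compared on the same wage regression.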
Published In
DOI
ISSN
Publication Date
Volume
Issue
Related Subject Headings
- Information & Library Sciences
- 4610 Library and information studies
- 4609 Information systems
- 0807 Library and Information Studies
- 0806 Information Systems
- 0804 Data Format