Designing an Enhanced Job Recommendation System for Online Job Hunting
Table of contents
- Introduction
- Literature Survey
- Conclusion
We address the problem of recommending suitable jobs to people who are seeking a new job, and we formulate it as a supervised machine learning problem. Our technique exploits all past job transitions, as well as the data associated with employees and institutions, to predict an employee’s next job transition. Faced with the enormous amount of recruiting information on the Internet, a job seeker often spends hours finding the useful postings. To reduce this laborious work, we design and implement a recommendation system for online job hunting. In this paper, we compare user-based and item-based collaborative filtering algorithms in order to choose the better-performing one. We also take background information into consideration, including students’ resumes and the details of recruiting information, and we incorporate the weights of co-applying users (the users who had applied for the candidate jobs) and the weights of the jobs that students have liked into the recommendation algorithm. Finally, the proposed model is verified through an experimental study using actual data. The recommended results achieve higher precision and recall, and they are more relevant to users’ preferences than before.
Introduction
The growth in Internet usage has heightened the need for online job hunting. According to Job site’s 2014 report, 68% of online job seekers are college graduates or postgraduates. The key problem is that most job-hunting websites simply display recruitment information to their visitors. Students have to go through all of it to find the jobs they want to apply for, a procedure that is tedious and inefficient. We need an easy-to-use job recommendation system that gives everyone a fair chance. Such a system saves considerable time and money on both the employer’s and the job seeker’s side. Moreover, because each candidate gets a fair chance to prove his or her talent, the overall process becomes more efficient. The basic aim of every algorithm in use today, whether traditional or hybrid, is to provide a suitable job that the user actually seeks.
Recently, job recommendation has attracted a lot of research attention and has played an important role on online recruiting websites. Unlike traditional recommendation systems, which recommend items to users, job recommender systems (JRSs) recommend one type of user (e.g., job applicants) to another type of user (e.g., recruiters). In particular, a job recommender system is designed to retrieve a list of job descriptions for a job applicant based on his or her preferences, or to generate a list of job candidates for a recruiter based on the job requirements. To obtain good recommendation results, many recommendation approaches have been proposed and applied in JRSs. Typically, given a user, existing JRSs employ one specific recommendation approach to generate a ranked list of jobs or candidates. However, different users may have different characteristics, and a single recommendation approach may not be suitable for all of them. Therefore, a high-quality JRS should be able to choose the appropriate recommendation approach according to the user’s characteristics.
Literature Survey
The JRS has been studied from many aspects. AlOtaibi et al. summarized the categories of existing online recruiting platforms and listed the advantages and disadvantages of the technical approaches used in different JRSs. For example, the probabilistic hybrid approach accomplishes bidirectional recommendation but allows only a binary representation. We have also done some research on feature extraction, resume mining, recommendation approaches, ranking, and explanation for the JRS. In our previous work, user profiling and similarity calculation are presented as the prevailing process of a JRS, and the architecture and product features are briefly discussed.
From the technical perspective, JRSs have been classified into five categories, described as follows:
a) Content-based Recommendation (CBR): The principle of content-based recommendation is to suggest items whose content information is similar to that of the corresponding users. For example, when recommending jobs to a job applicant, the content is the applicant’s personal information and job preferences. When recommending candidates to recruiters, the job description posted by the recruiter, including the background description of the enterprise, is used as the content. The basic process of content-based recommendation is to acquire the content information of job applicants and jobs and to calculate their similarities, so the quality of the content information plays an important role. Yu et al. presented a cascaded extraction approach for resumes to obtain more effective information. Yi et al. built a relevance-based language model, Structured Relevance Models, for modeling and retrieving semi-structured documents. Furthermore, Paparrizos et al. trained a machine learning model to predict candidates’ next job transition based on their past job histories as well as the data of both candidates and enterprises on the web.
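As a minimal sketch of the "acquire content, then calculate similarity" process described above, the following example ranks job descriptions against a resume using cosine similarity over raw term counts. The choice of cosine similarity, the tokenizer, and the toy data are assumptions made for illustration; the cited works use more elaborate extraction and language models.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def rank_jobs(resume_text, jobs):
    """Return (job_id, score) pairs sorted from most to least similar."""
    profile = term_counts(resume_text)
    scored = [(job_id, cosine_similarity(profile, term_counts(desc)))
              for job_id, desc in jobs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data, for illustration only.
resume = "python developer with machine learning experience"
jobs = {
    "ml_engineer": "machine learning engineer python",
    "accountant": "accountant with tax experience",
}
ranking = rank_jobs(resume, jobs)
```

In this toy example the resume shares three terms with the first posting and two with the second, so "ml_engineer" is ranked first. A practical system would at least remove stop words such as "with" and weight terms, for example by TF-IDF.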
b) Collaborative Filtering Recommendation (CFR): Collaborative filtering recommendation, also known as the user-to-user correlation method, finds users who have the same taste as the target user and recommends items based on what those similar users like. The key step in CFR is computing the similarities among users. Collaborative filtering algorithms can be classified as memory-based or model-based. In memory-based collaborative filtering, a user-item rating matrix is usually used as the input. In the job recruiting domain, user behaviors or actions can be converted into the user-item rating matrix according to predefined definitions and transition rules. Färber et al. presented an aspect model that produces a rating matrix by assigning assessed values to candidates’ profiles using the Expectation Maximization (EM) algorithm. Collaborative filtering works by building a database of users’ preferences for items. A new user, John, is matched against the database to discover his neighbors, that is, other users who have historically had tastes similar to John’s. Items that the neighbors like are then recommended to John, as he will probably like them too.
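The step of turning user actions into a rating matrix can be sketched as follows. The action names and their scores are hypothetical; a real system would define its own rules for mapping behavior to ratings.

```python
# Hypothetical mapping from logged actions to implicit rating values.
ACTION_SCORES = {"view": 1.0, "bookmark": 2.0, "apply": 3.0}


def build_ratings(events):
    """Turn (user, job, action) events into a sparse rating dict.

    ratings[user][job] holds the highest score among that user's
    actions on the job; pairs with no event stay absent (missing).
    """
    ratings = {}
    for user, job, action in events:
        score = ACTION_SCORES[action]
        user_row = ratings.setdefault(user, {})
        user_row[job] = max(user_row.get(job, 0.0), score)
    return ratings


# Toy event log, for illustration only.
events = [
    ("john", "job_a", "view"),
    ("john", "job_a", "apply"),
    ("john", "job_b", "view"),
    ("mary", "job_a", "bookmark"),
]
ratings = build_ratings(events)
```

Storing only the observed entries in nested dictionaries reflects the fact that most user-job pairs have no rating at all.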
For the purpose of subsequent discussion, we assume that the user-item ratings matrix is an incomplete m × n matrix R = [r_uj] containing m users and n items. It is assumed that only a small subset of the ratings matrix is specified or observed. Like all other collaborative filtering algorithms, neighborhood-based collaborative filtering algorithms can be formulated in two ways:
- Predicting the rating value of a user-item combination: This is the simplest and most primitive formulation of a recommender system. In this case, the missing rating r_uj of the user u for item j is predicted.
- Determining the top-k items or top-k users: In most practical settings, the merchant is not necessarily looking for specific ratings values of user-item combinations. Rather, it is more interesting to learn the top-k most relevant items for a particular user, or the top-k most relevant users for a particular item. The problem of determining the top-k items is more common than that of finding the top-k users. This is because the former formulation is used to present lists of recommended items to users in Web centric scenarios. In traditional recommender algorithms, the “top-k problem” almost always refers to the process of finding the top-k items, rather than the top-k users. However, the latter formulation is also useful to the merchant because it can be used to determine the best users to target with marketing efforts.
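The top-k items formulation can be sketched on top of any rating predictor. Here the predicted ratings are simply given as a dictionary, which is an assumption made for illustration; how they are produced is a separate question.

```python
import heapq


def top_k_items(predicted, already_rated, k):
    """Return the k unrated items with the highest predicted rating."""
    candidates = ((score, item) for item, score in predicted.items()
                  if item not in already_rated)
    return [item for score, item in heapq.nlargest(k, candidates)]


# Hypothetical predicted ratings for one user.
predicted = {"job_a": 4.2, "job_b": 3.1, "job_c": 4.8, "job_d": 2.0}
recommended = top_k_items(predicted, already_rated={"job_a"}, k=2)
```

Items the user has already rated are excluded, so "job_a" is skipped even though its predicted rating is high.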
Step 1: We assume that the ratings matrix is denoted by R, and that it is an m × n matrix containing m users and n items. The rating of user u for item j is denoted by r_uj. Only small subsets of the entries in the ratings matrix are typically specified. The specified entries of the matrix are referred to as the training data, whereas the unspecified entries of the matrix are referred to as the test data. There are two basic principles used in neighborhood-based models:
- User-based models: Similar users have similar ratings on the same item. Therefore, if Alice and Bob have rated jobs in a similar way in the past, then one can use Alice’s observed rating on the job Software Engineer to predict Bob’s unobserved rating on this job.
- Item-based models: Similar items are rated in a similar way by the same user. Therefore, Bob’s ratings on jobs similar to Software Engineer, such as Web Developer, can be used to predict his rating on Software Engineer.
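In implementation terms, the two principles above differ mainly in which axis of the ratings matrix is compared: user-based models compare rows, item-based models compare columns. A minimal sketch, assuming ratings are stored as a nested dictionary keyed by user (a storage choice made here for illustration), is to transpose the data so that each item has its own rating vector:

```python
def transpose(ratings):
    """Convert ratings[user][item] into by_item[item][user]."""
    by_item = {}
    for user, row in ratings.items():
        for item, value in row.items():
            by_item.setdefault(item, {})[user] = value
    return by_item


# Toy ratings, for illustration only.
ratings = {
    "alice": {"software_engineer": 5, "web_developer": 4},
    "bob": {"web_developer": 4},
}
by_item = transpose(ratings)
```

After transposition, the same similarity function that compares two users can be applied to two items, because both are then sparse vectors of ratings.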
Step 2: In this approach, user-based neighborhoods are defined in order to identify similar users to the target user for whom the rating predictions are being computed. In order to determine the neighborhood of the target user i, her similarity to all the other users is computed. Therefore, a similarity function needs to be defined between the ratings specified by users. Such a similarity computation is tricky because different users may have different scales of ratings. One user might be biased toward liking most items, whereas another user might be biased toward not liking most of the items. Furthermore, different users may have rated different items. Therefore, mechanisms need to be identified to address these issues.
For the m × n ratings matrix R = [r_uj] with m users and n items, let I_u denote the set of item indices for which ratings have been specified by user (row) u. For example, if the ratings of the first, third, and fifth items (columns) of user (row) u are specified (observed) and the remaining are missing, then we have I_u = {1, 3, 5}. Therefore, the set of items rated by both users u and v is given by I_u ∩ I_v. For example, if user v has rated the first four items, then I_v = {1, 2, 3, 4}, and I_u ∩ I_v = {1, 3, 5} ∩ {1, 2, 3, 4} = {1, 3}. It is possible (and quite common) for I_u ∩ I_v to be an empty set because ratings matrices are generally sparse. The set I_u ∩ I_v defines the mutually observed ratings, which are used to compute the similarity between the uth and vth users for neighborhood computation. One measure that captures the similarity Sim(u, v) between the rating vectors of two users u and v is the Pearson correlation coefficient. Because I_u ∩ I_v represents the set of item indices for which both user u and user v have specified ratings, the coefficient is computed only on this set of items.
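The formulas themselves are not reproduced in the text. Assuming the standard formulation of neighborhood-based collaborative filtering, with μ_u denoting the mean rating of user u, they can be written as follows. The label (2.1) is attached to the mean on the assumption that this is the equation the discussion refers to.

```latex
% Mean rating of user u over all items that u has rated
\mu_u = \frac{\sum_{k \in I_u} r_{uk}}{|I_u|} \qquad (2.1)

% Pearson correlation over the mutually rated items
\mathrm{Sim}(u, v) = \mathrm{Pearson}(u, v) =
  \frac{\sum_{k \in I_u \cap I_v} (r_{uk} - \mu_u)\,(r_{vk} - \mu_v)}
       {\sqrt{\sum_{k \in I_u \cap I_v} (r_{uk} - \mu_u)^2}\;
        \sqrt{\sum_{k \in I_u \cap I_v} (r_{vk} - \mu_v)^2}}
```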
Strictly speaking, the traditional definition of Pearson(u, v) mandates that the values of μ_u and μ_v be computed only over the items that are rated by both users u and v. Unlike Equation 2.1 (the mean rating μ_u of user u taken over all items in I_u), such an approach leads to a different value of μ_u depending on the choice of the other user v against which the Pearson similarity is being computed. However, it is quite common (and computationally simpler) to compute each μ_u just once for each user u, according to Equation 2.1. It is hard to argue that one of these two ways of computing μ_u always provides strictly better recommendations than the other. In extreme cases, where the two users have only one mutually specified rating, it can be argued that using Equation 2.1 for computing μ_u provides more informative results, because the Pearson coefficient is indeterminate over a single common item in the traditional definition. Therefore, we work with the simpler assumption of using Equation 2.1 here. Nevertheless, it is important for the reader to keep in mind that many implementations of user-based methods compute μ_u and μ_v in pairwise fashion during the Pearson computation.
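The difference between the two ways of computing the means can be sketched in a few lines. The function names, the `pairwise_means` flag, and the toy ratings are assumptions made for illustration; the item indices for users u and v follow the example sets I_u = {1, 3, 5} and I_v = {1, 2, 3, 4} used earlier.

```python
import math


def mean_rating(ratings, user, items=None):
    """Mean of a user's ratings over `items` (default: all items the user rated)."""
    row = ratings[user]
    keys = list(row) if items is None else list(items)
    return sum(row[k] for k in keys) / len(keys)


def pearson(ratings, u, v, pairwise_means=False):
    """Pearson similarity over the items rated by both u and v.

    Returns None when there is no common item or when the value is
    indeterminate (zero variance over the common items).
    """
    common = ratings[u].keys() & ratings[v].keys()
    if not common:
        return None
    if pairwise_means:
        # Traditional definition: means over the common items only.
        mu_u = mean_rating(ratings, u, common)
        mu_v = mean_rating(ratings, v, common)
    else:
        # Simpler convention: one mean per user over all rated items.
        mu_u = mean_rating(ratings, u)
        mu_v = mean_rating(ratings, v)
    num = sum((ratings[u][k] - mu_u) * (ratings[v][k] - mu_v) for k in common)
    den_u = math.sqrt(sum((ratings[u][k] - mu_u) ** 2 for k in common))
    den_v = math.sqrt(sum((ratings[v][k] - mu_v) ** 2 for k in common))
    if den_u == 0 or den_v == 0:
        return None
    return num / (den_u * den_v)


# Toy ratings keyed by item index, for illustration only.
ratings = {
    "u": {1: 5, 3: 3, 5: 1},
    "v": {1: 4, 2: 1, 3: 2, 4: 1},
    "w": {1: 2, 6: 4},  # shares only item 1 with u
}
sim_uv = pearson(ratings, "u", "v")
sim_uw_global = pearson(ratings, "u", "w")
sim_uw_pairwise = pearson(ratings, "u", "w", pairwise_means=True)
```

Users u and w share a single item. With per-user means the similarity is still defined (here it is negative, since u rated the item above her own average and w below his), whereas with pairwise means both deviations are zero and the function returns None, which is the indeterminate case described above.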
c) Knowledge-based Recommendation (KBR): In knowledge-based recommendation, rules and patterns obtained from functional knowledge of how a specific item meets the requirements of a particular user are used for recommending items. For example, employees who have one or more years of work experience exhibit better performance than those without experience. This can be used as a job performance rule in online recruiting. Chien et al. developed a data mining framework based on decision trees and association rules to generate useful rules for personnel selection and for enhancing human capital. In addition, other types of knowledge, such as ontologies, can also be used in job recommendation. Lee and Brusilovsky employed an ontology checker to match information against an ontology and perform the classification in the JRS.
d) Reciprocal Recommendation (ReR): First proposed by Luiz Pizzato et al., the reciprocal recommender is a special kind of recommender system in which the preferences of all the users involved are taken into account and need to be satisfied at the same time. As a result, ReR achieves a win-win situation for users and improves the accuracy of recommender systems that match people and jobs. Yu et al. proposed a similarity calculation method for computing the reciprocal value and achieving reciprocal recommendation, based on the explicit preferences obtained from users’ resumes and the implicit preferences acquired from users’ interaction histories. Malinowski et al. also used a bilateral recommendation approach that considers both sides of the JRS to match job applicants and jobs. Li et al. proposed a generalized framework for reciprocal recommendation applied to online recruiting, in which the correlations among users are modeled by a bipartite graph.
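One simple way to require that both sides' preferences be satisfied at the same time is to combine the two directed scores so that a low score on either side pulls the result down. The sketch below uses the harmonic mean for this purpose. That choice, and the assumption that both scores lie in [0, 1], are ours for illustration and are not claimed to be the method of any work cited above.

```python
def reciprocal_score(applicant_to_job, job_to_applicant):
    """Harmonic mean of the two directed preference scores in [0, 1]."""
    if applicant_to_job <= 0 or job_to_applicant <= 0:
        return 0.0  # no match if either side has no interest
    total = applicant_to_job + job_to_applicant
    return 2 * applicant_to_job * job_to_applicant / total


# A one-sided match scores lower than a balanced one.
one_sided = reciprocal_score(0.9, 0.1)
balanced = reciprocal_score(0.5, 0.5)
```

The two pairs have the same arithmetic mean, yet the balanced pair receives the higher reciprocal score, which is the behavior a reciprocal recommender is looking for.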
e) Hybrid Recommendation (HyR): All of the recommendation approaches mentioned above have their limitations. To overcome these limitations, the approaches have been integrated to obtain better performance. Burke presented seven categories of hybrid recommender systems: weighted, switching, mixed, feature combination, cascade, feature augmentation, and meta-level. Malinowski et al. applied a probabilistic model to the two parts of a JRS, a CV-recommender and a job recommender, separately, and integrated the results in order to improve the match between job applicants and jobs. Keim [20] integrated the prior research into a unified multilayer framework supporting the matching of individuals for recruitment and team staffing processes.
Fazel-Zarandi and Fox combined different matchmaking strategies in a hybrid approach for matching job applicants and jobs by using logic-based and similarity-based matching.
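Of the hybrid categories listed above, the weighted hybrid is the simplest to illustrate: each component recommender scores the jobs, and the scores are combined by a weighted sum. The sketch below assumes that every component produces scores on a common [0, 1] scale; the recommender names, weights, and scores are hypothetical.

```python
def weighted_hybrid(score_lists, weights):
    """Rank jobs by a weighted sum of per-recommender scores.

    score_lists maps recommender name -> {job_id: score in [0, 1]};
    a job missing from one recommender contributes 0 for it.
    """
    combined = {}
    for name, scores in score_lists.items():
        for job, score in scores.items():
            combined[job] = combined.get(job, 0.0) + weights[name] * score
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical component scores for one applicant.
score_lists = {
    "content": {"job_a": 0.9, "job_b": 0.4},
    "collaborative": {"job_b": 0.8, "job_c": 0.6},
}
weights = {"content": 0.5, "collaborative": 0.5}
ranking = weighted_hybrid(score_lists, weights)
```

Here "job_b" is ranked first because both components support it, even though neither ranks it far ahead on its own.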
Conclusion
Based on this study of the various techniques, and after implementing the algorithms, we select the CF-based algorithm for its better overall performance. Of course, further improvements and hybrid algorithms still need to be implemented alongside the CF algorithm. To further optimize the recommendation system and integrate it for better performance, we will monitor the sparsity of user profiles and investigate methods for filling in the user’s preference matrix and how the filled matrix can be utilized.