
Kerala to Sweden: Migrant Job Alignment


FROM KERALA TO SWEDEN: ASSESSING JOB
QUALIFICATION ALIGNMENT AND UNRAVELING
THE MOTIVATIONS, EXPERIENCES, AND
CHALLENGES OF MIGRANTS

Submitted by
Sukanya Thayyil Sunilkumar

A thesis submitted to the Department of Statistics
in partial fulfilment of the requirements
for a one-year Master of Science degree in Statistics
in the Faculty of Social Sciences

Supervisor
Per Johansson

Spring, 2024
ABSTRACT

This study investigates job-qualification alignment among Kerala immigrants in Sweden,
focusing on predicting job matches, understanding migration reasons, and exploring integration
challenges. A pilot survey was conducted, drawing on field knowledge and discussions with
Malayali communities in different parts of Sweden. The survey was designed using Google
Forms, primarily with multiple-choice questions so that respondents could complete it quickly
and easily. Multiple imputation was applied to address missing data. Ethical considerations
were prioritized by not collecting personal information and by allowing respondents to skip
questions, add their own options, or freely share their opinions through the "Other" option.
Due to the lack of a sampling frame for the population, non-random sampling methods, including
judgmental and snowball sampling, were used. Information from the Indian embassy, Malayali
organizations, WhatsApp groups, and Facebook groups was gathered to facilitate data collection.
These methods allowed for the targeted collection of data, providing a narrative of the
experiences of Kerala immigrants in Sweden. A limitation of these methods, however, is that the
results may be hard to generalize to the Kerala population in Sweden, as no information on the
overall population exists. Future research should examine broader migration trends and use
methods that track changes over time to understand long-term integration better.
Acknowledgments
I express my deepest gratitude to my supervisor, Per Johansson, for his invaluable guidance,
patience, and expert advice throughout this research. His insights and expertise have been
instrumental in defining this study’s direction and outcome.
I am immensely thankful to Mattias Nordin for his invaluable support as a course coordinator
throughout the entire study period. My sincere appreciation also goes to Yukai Yang,
Johan Lyhagen, and Rauf Ahmad, whose valuable classes have significantly contributed to
my ability to write this thesis.
I extend my gratitude to all the lecturers in the Department of Statistics at Uppsala
University, as well as the guest lecturers who participated in the Wednesday seminars, for
providing me with comprehensive knowledge in Statistics during my time at Uppsala University.
A special thanks to the entire Kerala community in Sweden, especially those who participated
in, shared information for, and supported this data collection. Your contributions have
been crucial to the success of this research.
I am extremely grateful to my family for their unwavering support. To my dearest late father,
Sunilkumar, my mother, Santhakumary, my brother, Arunkumar, my daughter, Janaki,
my husband, Sarath Pallathery Ramachandran, and my in-laws, Ramani and Ramachandran:
thank you for your love, encouragement, and the sacrifices you have made to help me
reach this point. Your support has been my pillar of strength and motivation.
I would also like to express my appreciation to all my previous teachers, especially Maya
Miss, Venu Sir, and Smitha Miss, whose early guidance and inspiration set me on this path.
To my friends and the organisations that I am part of, thank you for your constant
encouragement and belief in me.
Above all, I am thankful to God for the blessings, strength, and opportunities provided
throughout this journey.
To all of you who have supported me in different ways on this path: your faith in me, your
endless love, your critical words, your hate, your silence and your sacrifices have made this
achievement possible. Thank you all so much.

Contents
1 Introduction

2 Literature Review
2.1 Existing Research on Migration
2.1.1 Global and Indian Context
2.1.2 Theoretical Frameworks
2.2 Job Qualification Alignment
2.2.1 Definition and Importance
2.2.2 Previous Studies
2.3 Challenges in Migration
2.3.1 Common Challenges
2.3.2 Specific Challenges in Sweden
2.4 Sampling Methods
2.4.1 Data Collection Methods
2.4.2 Online Data Collection

3 Methodology
3.1 Research Design
3.1.1 Sampling Design
3.1.2 Sampling Methods
3.2 Data Collection Process
3.2.1 Distribution
3.2.2 Challenges
3.3 Data Cleaning
3.3.1 Standardization
3.3.2 Handling Missing Data
3.4 Ethical Considerations
3.4.1 Informed Consent
3.4.2 Confidentiality

4 Descriptive Statistics
4.1 Bar Plots Explanations for Variables (Appendix F)
4.2 Summary of Statistics

5 Analysis of Statistical Modeling
5.1 Appropriate Dependent Variable
5.2 Absence of Outliers
5.3 Cross-tabulation of Dependent-Independent variables for Modelling Variables
5.3.1 Job Match by Gender
5.3.2 Job Match by Age
5.3.3 Job Match by Arrival Year
5.3.4 Job Match by County in Sweden
5.3.5 Job Match by Education Level
5.3.6 Job Match by Educational Background
5.3.7 Job Match by Employment Status
5.4 Binary Logistic Regression and Model Selection
5.4.1 Results
5.4.2 Model Performance

6 Discussion
6.1 Results
6.2 Implications
6.3 Limitations and Challenges

References

Appendix

A Base Level and Unimportant Variables in LASSO

B Model Implementation

C Performance Metrics Calculation

D Multicollinearity

E LASSO Coefficient Paths for Various Predictors

F Survey Questions and Corresponding Bar Plots

G Missing Data Pattern and Imputation

List of Figures
1 LASSO Coefficient Paths for Various Predictors
2 1. Gender
3 2. Age
4 3. Why did you move from Kerala to Sweden?
5 4. When did you arrive in Sweden?
6 5. Did you move to Sweden alone or with family?
7 6. What is your current visa status?
8 7. How did you apply for your visa the first time?
9 8. Did you have an interview for the first-time visa process?
10 9. What is your highest level of education?
11 10. What is your educational background?
12 11. Do you have a job?
13 12. Does your current job match your qualifications?
14 13. In your job (part-time or full-time), are you receiving a salary or amount as specified by Swedish regulations?
15 14. How satisfied are you with the job you have now?
16 15. What do you think are the main problems in getting a job in Sweden?
17 16. Are you satisfied with medical care in Sweden?
18 17. Have you experienced any health issues after moving to Sweden?
19 18. How much do you care about your mental and physical health after moving to Sweden?
20 19. How satisfied are you with Swedish food culture compared to Kerala food culture?
21 20. How satisfied are you with the amount of personal or family time you have after moving to Sweden?
22 21. How satisfied are you with your life after moving to Sweden?
23 22. Which district are you from in Kerala?
24 23. Which county are you currently residing in Sweden?
25 24. How would you rate cultural integration with Swedish society?
26 25. Have you faced any challenges after moving to Sweden? If yes, how often have you faced these challenges within the first year you came?
27 26. What type of accommodation do you have?
28 27. What is your rent range per month?
29 28. Are you a parent? If yes, how satisfied are you with parental benefits in Sweden?
30 29. How much knowledge did you have about Sweden before moving?
31 Missing Model Data Pattern
32 Missing Data Pattern

List of Tables
1 Cross-tabulation of Job_Match and Gender
2 Cross-tabulation of Job_Match and Age
3 Cross-tabulation of Job_Match and Arrival_Year
4 Cross-tabulation of Job_Match and County_Sweden
5 Cross-tabulation of Job_Match and Education_Level
6 Cross-tabulation of Job_Match and Educational_Background
7 Cross-tabulation of Job_Match and Has_Job
8 Positive Coefficients in LASSO
9 Negative Coefficients in LASSO
10 Confusion Matrix: Cross-tabulation of Predicted vs. Actual Job Match
11 Performance Metrics for Lasso Logistic Regression Model
12 Base Level Variables
13 Unimportant Variables in LASSO
14 Variance Inflation Factors (VIF) for Independent Variables
15 Table with Highlighted Imputed Values in the Model Data

1 Introduction
The experience of moving from Kerala, a state in India, to Sweden has raised many questions
about both the pros and cons of migration. The primary challenge in addressing these questions
was the lack of data; no data exists specifically on people from Kerala residing in Sweden. To
address this issue, a survey was designed, and data were collected from individuals of Kerala
origin, enabling answers to some of these questions that may also be of interest to others.
As people from Kerala speak Malayalam, they are generally called Malayalies. In this
study, the terms "People from Kerala" and "Malayalies" will be used interchangeably.
Malayalies have a long history of migrating to different parts of the world. There is a
popular joke in Kerala that says, "If you go to the moon, you will see a Malayali running a
coffee shop there." While intended as a joke, this illustrates how far and wide Malayalies
have spread. Migration occurs for a variety of reasons, with Sweden being
perceived by many as a land of opportunities. However, there are also common beliefs within
the community that finding a job in Sweden, particularly part-time employment, is challenging.
This study seeks to confirm or dispel such beliefs and examine whether the jobs Malayali
migrants obtain align with their skills and qualifications. Additionally, the study aims to
understand the motivations, experiences, and challenges faced by Malayalies who have moved to
Sweden. Three main objectives are outlined in this study:

1. Determine whether the jobs that Kerala migrants obtain in Sweden align with their
qualifications. The feasibility of modelling this data to predict job-qualification alignment,
identifying key influencing factors, and assessing the potential accuracy of the model will
be explored.

2. Investigate the reasons behind their decision to migrate to Sweden.

3. Explore their experiences and the challenges they encounter in the process of moving to
and integrating into Swedish society.

One major challenge in data collection was the absence of a sampling frame for the Kerala
population living in Sweden, making it impossible to use a probability sampling method.
Instead, judgmental sampling (Perla and Provost, 2012) and snowball sampling (Goodman,
1961), both non-probability sampling methods, were employed. Judgmental sampling involved
using prior knowledge of Kerala individuals in Sweden to gather data, while in snowball
sampling, current participants helped recruit future participants from their networks. This
approach allowed access to groups and communities that might otherwise be difficult to reach.
This mixed non-probability sampling approach likely resulted in a higher response rate than
would have been achieved using random sampling. Additionally, the method is expected to provide
more reliable responses than those from a random sample. However, it is important to note that
the results may not be fully generalizable to the Kerala population living in Sweden. A total
of 716 responses were received. While no exact figure exists for the number of Malayalies in
Sweden, an estimate based on survey responses suggests that approximately 2,000-3,000
Malayalies reside in the country. This estimate was obtained by aggregating information from
Malayalies living in various counties, Facebook groups, WhatsApp groups, and the Indian
Embassy in Sweden. Approximately 25-35% of the Malayali population in Sweden responded to the
survey, suggesting that the responses may provide a representative picture of the experiences
of Malayalies in Sweden.
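The coverage figure quoted above follows from simple arithmetic on the reported numbers. The short sketch below, written in Python purely for illustration (the variable names are not from the thesis), divides the 716 responses by the two ends of the estimated population range. The unrounded bounds come out at roughly 24% and 36%, in line with the approximate 25-35% stated in the text.

```python
# Rough share of the estimated population covered by the survey.
responses = 716                                # total responses reported in the text
population_low, population_high = 2000, 3000   # estimated Malayali population range

share_low = responses / population_high    # most conservative coverage
share_high = responses / population_low    # most optimistic coverage

print(f"Coverage: {share_low:.1%} to {share_high:.1%}")
```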
The survey yielded several interesting findings. Immigration from Kerala to Sweden began
before 2000 and includes individuals of all ages, including pensioners, living in various types of
housing, with apartments being the most common. The sample also showed a higher proportion
of women than men, with Västra Götaland County hosting the largest Malayali community in
Sweden. An increasing trend of migration from Kerala to Sweden was observed, and most
respondents reported being satisfied with their lives in Sweden.
The following sections of this paper will delve deeper into various aspects of the study.
The Literature Review will explore existing research on migration, job-qualification
alignment, challenges faced by migrants, and sampling methods. The Methodology section will
describe the survey design, data collection processes, data cleaning and analysis methods, and
ethical considerations. Descriptive Statistics will present visualizations and initial
interpretations of the survey data. In Analysis of Statistical Modeling, the LASSO model and
other statistical techniques will be used to analyze job alignment, with results explaining
the impact of coefficients and interpreting findings from the statistical models, as well as
assessing model performance. Finally, the Discussion will summarize the results, implications,
limitations, and challenges of the study.

2 Literature Review

2.1 Existing Research on Migration

2.1.1 Global and Indian Context

Migration has been a significant area of research globally, with various studies focusing on the
economic, social, and cultural impacts of migration. Research by Cassarino (2004) and Larsson
(2024) highlights the complexities of modern migration patterns and their implications for both
sending and receiving countries. Migration from Kerala, India, to Sweden is a relatively
underexplored area.

2.1.2 Theoretical Frameworks

Various theoretical frameworks have been used to study migration, including neoclassical
economic theories, push-pull models, and transnationalism. For example, Lee (1966) introduced
the push-pull theory, which remains a foundational concept in migration studies.

2.2 Job Qualification Alignment

2.2.1 Definition and Importance

Job qualification alignment refers to the degree to which migrants’ qualifications and skills
match the demands of the labour market in the host country. Chiswick and Miller (2003)
emphasize the importance of this alignment for both economic integration and job satisfaction
among migrants.

2.2.2 Previous Studies

Previous studies have shown mixed results regarding job qualification alignment among
migrants. While some studies, like Piracha and Vadean (2013), suggest that migrants often
face a mismatch, others, such as Bratsberg et al. (2002), indicate that over time, migrants
tend to find jobs that better match their qualifications.

2.3 Challenges in Migration

2.3.1 Common Challenges

Migrants face a variety of challenges, including language barriers, cultural differences, and
legal issues. According to Martin (2009), these challenges can be exacerbated during economic
downturns, which affect migrants’ ability to find employment.

2.3.2 Specific Challenges in Sweden

In Sweden, migrants often struggle with integrating into the labour market due to stringent
qualification recognition processes and language requirements (Lundborg and Skedinger, 2013).
These challenges are particularly pronounced for non-European migrants.

2.4 Sampling Methods

2.4.1 Data Collection Methods

Data collection in migration studies typically involves both qualitative and quantitative
methods. Surveys, interviews, and administrative data are common sources. Groves et al. (2011)
provide an in-depth discussion of survey methodology, which is often employed in migration
research.

2.4.2 Online Data Collection

With the advent of digital technologies, online data collection methods have become
increasingly popular. Evans and Mathur (2009) discuss the advantages and limitations of online
surveys, particularly in reaching migrant populations.

3 Methodology

3.1 Research Design

3.1.1 Sampling Design

The sampling design was carefully structured to collect relevant data from Malayalies in
Sweden. Initially, a draft questionnaire was created based on field knowledge and the questions
that commonly arise when considering migration to Sweden. Additionally, the experiences of
students admitted to Uppsala University in recent years were drawn upon.
To modify this draft, feedback was gathered from Malayalies both inside and outside Sweden,
representing various age groups and sectors. Their insights were used to improve the questions
and address the research concerns effectively. Following this, a pilot study questionnaire
was developed, and 11 Malayalies currently residing in Sweden were invited to participate. A
diverse group was selected among these participants to ensure a wide range of feedback was
obtained.
Based on the feedback from this pilot study, adjustments were made, including the addition
of new questions, the removal of some, and the reordering of others. The final questionnaire
consists of 9 sections with a total of 30 questions:

1. Personal Information - 2 questions (Figures 2 and 3 in Appendix F show the questions
with bar plots of the responses.)

2. Migration Details - 6 questions (Figures 4 to 9 in Appendix F show the questions with
bar plots of the responses.)

3. Education and Employment - 7 questions (Figures 10 to 16 in Appendix F show the
questions with bar plots of the responses.)

4. Health and Satisfaction - 6 questions (Figures 17 to 22 in Appendix F show the questions
with bar plots of the responses.)

5. Geographical and Social Integration - 4 questions (Figures 23 to 26 in Appendix F show
the questions with bar plots of the responses.)

6. Accommodation - 2 questions (Figures 27 and 28 in Appendix F show the questions with
bar plots of the responses.)

7. Parental and Health Benefits - 1 question (Figure 29 in Appendix F shows the question
with a bar plot of the responses.)

8. Pre-Migration Knowledge - 1 question (Figure 30 in Appendix F shows the question
with a bar plot of the responses.)

9. Consent for Data Collection - 1 question (30. Do you consent to the collection and use
of your data for this survey?)

Out of the 30 questions, 27 are multiple-choice, while 3 (shown in Figures 4, 16, and 18 in
Appendix F) allow for multiple responses. All questions are mandatory, but
participants have the option to skip a question using choices like "Not preferred to say", "Do
not want to answer", or "Do not want to answer/Do not know". Each question also includes an
"Other" option, allowing participants to provide responses that may not have been covered in
the listed options, ensuring that everyone has the freedom to accurately represent their
situation. Any missing data¹ in the responses are therefore due to an intentional choice not
to answer, as indicated in the questionnaire.
To make the data collection more accessible and reduce misunderstandings, the questionnaire
was provided in English with Malayalam² translations in parentheses. This approach
helps participants better understand the questions and ensures more accurate responses.

3.1.2 Sampling Methods

As there was no sampling frame of Malayalies in Sweden, a probability sampling method could
not be used for data collection. Instead, non-probability sampling methods, namely judgmental
sampling (Perla and Provost, 2012) and snowball sampling (Goodman, 1961), were employed.
Knowledge of this field and connections with Kerala immigrants in Sweden were valuable
in collecting and distributing the survey. These methods are often considered to yield more
accurate results than random sampling in this context. In judgmental sampling, prior knowledge
of the community was used to select the most suitable participants for the study. Snowball
sampling allowed current participants to recruit others from their networks, facilitating
access to groups and communities that might otherwise be difficult to reach. By combining
these methods, more responses were gathered, targeting the most relevant individuals and
expanding the reach through their connections.
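The combination of judgmental seeds and snowball referrals can be pictured as a wave-by-wave traversal of a contact network. The following Python sketch illustrates that idea only; it is not the procedure used in this study. The network, the names, and the function `snowball_sample` are hypothetical, and the sketch makes the simplifying assumption that every referred person takes part.

```python
from collections import deque

def snowball_sample(contacts, seeds, max_waves):
    """Recruit participants wave by wave through referrals.

    contacts:  dict mapping each person to the people they can refer
               (a hypothetical network, assumed for illustration).
    seeds:     initial participants chosen by judgment.
    max_waves: number of referral rounds allowed after the seeds.
    """
    recruited = list(seeds)
    seen = set(seeds)
    frontier = deque((person, 0) for person in seeds)
    while frontier:
        person, wave = frontier.popleft()
        if wave == max_waves:
            continue  # this person was recruited in the final wave; no further referrals
        for contact in contacts.get(person, []):
            if contact not in seen:
                seen.add(contact)
                recruited.append(contact)
                frontier.append((contact, wave + 1))
    return recruited

# Hypothetical contact network: "A" is the judgmentally chosen seed.
contacts = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "E": ["F"],
}
sample = snowball_sample(contacts, seeds=["A"], max_waves=2)
print(sample)  # "F" is three referrals away from the seed and is not reached
```

The sketch also shows why such samples depend on the starting points: people who are not connected to any seed within the allowed number of waves never enter the sample.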

3.2 Data Collection Process

3.2.1 Distribution

The survey was created using Google Forms and distributed through various online platforms
and social networks, including Email, Facebook, Instagram, WhatsApp, and LinkedIn. This
method was chosen to reach as many participants as possible and to make it convenient for
them to complete the survey at their own pace, using their preferred devices.

¹ Responses marked as "Not preferred to say", "Do not want to answer", or "Do not want to
answer/Do not know" are considered as missing data.
² Malayalam is the native language of Kerala.

3.2.2 Challenges

Several challenges were encountered during data collection, including a low number of
responses and incomplete responses. The data collection period was initially planned for two
weeks. However, after the first week, it was realized that the number of responses was too
small for meaningful analysis. To address this issue, the Google Form link was recirculated
with a video request instead of a written request, which doubled the response rate.
Despite this improvement, the desired number of responses had not been reached by the
end of the two-week period. To gather more data, the collection period was extended by an
additional week, and follow-up reminders were sent.
Initially, email addresses were collected in the survey, but it was observed that some
respondents were hesitant to share personal details. Consequently, the requirement to provide
an email address was removed, making the survey simpler and more attractive to participants. As
a result, the total number of respondents reached N = 716.
In the judgmental sampling, an attempt was made to reach at least one person in every
county. However, some counties, such as Blekinge, Jämtland, Kalmar, Norrbotten, and Värmland,
were not reached. It remains unclear whether this was due to the absence of Malayalies
in these counties, the survey not being seen in time, or respondents being unwilling to
participate. It is known that Malayalies reside in Kalmar County, but no responses were
received from there. This was one of the challenges faced during data collection.

3.3 Data Cleaning

3.3.1 Standardization

All responses were converted into categorical factors. Responses marked as "Not preferred
to say," "Do not want to answer," or "Do not want to answer/Do not know" were treated as
missing data to improve the model. These responses were considered missing because they
do not provide useful information about the questions being studied. Treating them as missing
helps prevent the model from being affected by answers that add no real value, which
could lead to incorrect or biased results. By handling these responses as missing, the analysis
can focus on the answers that do provide useful data, leading to better predictions and a
clearer understanding of the survey results.
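The recoding rule described above can be sketched in a few lines. This is an illustrative Python example, not the code used in the thesis: the helper `recode_missing` and the sample answers are hypothetical, while the three skip options are those quoted in the text.

```python
# Answer options that the questionnaire offers for skipping a question.
SKIP_OPTIONS = {
    "Not preferred to say",
    "Do not want to answer",
    "Do not want to answer/Do not know",
}

def recode_missing(responses):
    """Replace skip options with None so that they are treated as missing."""
    return [None if answer in SKIP_OPTIONS else answer for answer in responses]

# Hypothetical answers to a single question.
raw = ["Female", "Male", "Not preferred to say", "Female"]
clean = recode_missing(raw)
print(clean)  # ['Female', 'Male', None, 'Female']
```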
In the "Other" option, respondents were provided space to write comments, which was
helpful in understanding their exact situations. For analysis purposes, these comments were
converted into appropriate categories or left as "Other" when necessary. Some comments were
explanations of existing categories, while others offered new opinions. The new opinions were
incorporated into relevant categories for analysis.

3.3.2 Handling Missing Data

Missing data were identified in the 27 multiple-choice questions, as shown in Figure 32
(Appendix G), and in the modelling data, as shown in Figure 31 (Appendix G). To handle missing
data in the subsequent multivariate modelling, multiple imputation³ was performed with the
"MICE" package in R, as discussed in Zhang (2016). The results of this
imputation are presented in Table 15 (Appendix G). It should be noted that some categories in
the table are hidden to protect the privacy of the respondents.
Multiple imputation creates several different plausible datasets, and in the analysis, the
results from each one are combined to obtain valid inference.
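The combining step is commonly carried out with Rubin's rules: the pooled estimate is the mean of the per-dataset estimates, and its total variance adds the average within-imputation variance to the between-imputation variance inflated by a factor of (1 + 1/m). The Python sketch below illustrates this for a single scalar estimate. It is an assumption-laden illustration rather than the thesis code, which used R; the function name `pool_rubin` and the numbers are hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool one scalar estimate across m imputed datasets using Rubin's rules.

    estimates: the estimate obtained from each imputed dataset.
    variances: the squared standard error of each of those estimates.
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled = mean(estimates)                 # pooled point estimate
    within = mean(variances)                 # average within-imputation variance
    between = variance(estimates)            # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between   # total variance of the pooled estimate
    return pooled, total

# Hypothetical results from m = 3 imputed datasets.
pooled, total = pool_rubin([0.50, 0.60, 0.70], [0.04, 0.05, 0.06])
print(round(pooled, 4), round(total, 4))
```

The between-imputation term is what distinguishes this from single imputation: it carries the extra uncertainty that comes from not knowing the missing values.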

3.4 Ethical Considerations

3.4.1 Informed Consent

Participants were informed about the purpose of the study, what their involvement would be,
and how their data would be used. Informed consent was obtained before participants completed
the survey, making sure they understood that their participation was completely voluntary.

3.4.2 Confidentiality

Strong steps were taken to protect participants’ privacy. No personal information was collected
in the survey, ensuring that responses remain completely anonymous. The data are presented
only in summary form, making it impossible to trace any answers back to an individual. All
data will be securely stored and accessible only to those directly involved in the study. These
precautions ensure that participants’ privacy is protected throughout the research.

³ Multiple Imputation (MI) is a statistical technique used to handle missing data in datasets.
Instead of filling in missing values with a single estimate, MI replaces each missing value
multiple times, creating several "complete" datasets. These datasets are then analyzed
separately, and the results are pooled to provide a more robust and accurate estimate.

4 Descriptive Statistics

4.1 Bar Plots Explanations for Variables (Appendix F)

Gender Figure 2 shows the distribution of respondents by gender. The majority are Female
(58.40%), followed by Male (40.80%). A small percentage chose not to disclose their gender
(0.70%), and even fewer identified as Other (0.10%).

Age Figure 3 shows the age distribution of respondents. The largest group is in the 33-37 age
range (35.60%), followed by those in the 28-32 range (31.10%) and 38-42 (16.80%). Younger
and older age groups are less represented, with the smallest groups being those aged 53-57 and
63-67.

Move Reason Figure 4 shows the reasons for moving to Sweden. The most common reason is
Job (46.40%), followed by Career growth (33.20%), Higher education (32.50%) and To settle
in Europe (29.30%). The least common reasons were Family reunion (21.10%) and Schengen
visa (8.70%). Because respondents could select more than one reason for this question, the
percentages sum to more than 100%.
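Since this question allowed multiple responses, each percentage is a share of respondents rather than a share of answers, so the six figures above add up to 171.20% rather than 100%. A minimal Python sketch of that calculation is given below; the function `multi_response_percentages` and the four example respondents are hypothetical and serve only to show the mechanics.

```python
from collections import Counter

def multi_response_percentages(answers):
    """Percentage of respondents who selected each option.

    answers: one set of selected options per respondent.
    """
    n = len(answers)
    counts = Counter(option for selected in answers for option in selected)
    return {option: 100 * count / n for option, count in counts.items()}

# Hypothetical respondents, each free to pick several reasons.
answers = [
    {"Job", "Career growth"},
    {"Job"},
    {"Higher education", "To settle in Europe"},
    {"Job", "Higher education"},
]
shares = multi_response_percentages(answers)
print(shares["Job"], sum(shares.values()))  # 75.0 175.0
```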

Arrival Year Figure 5 shows the distribution of respondents by the year they arrived in
Sweden. The highest numbers of arrivals occurred in 2023 (26.50%) and 2022 (21.80%), which
suggests a recent surge in immigration. Arrivals show an increasing trend over the years,
indicating a strong and growing migration flow of Malayalies into Sweden.
The years 2020 and 2024 show lower numbers of arrivals compared to other years. The lower
number for 2020 may be due to the COVID-19 pandemic, which severely affected global travel
and immigration. The number for 2024 is low because the year was not yet complete when the
data were collected. The counts may be affected by snowball sampling, but the increasing trend
is likely to hold even if the exact counts change.

Move With Family Figure 6 shows whether respondents moved alone or with family. A
large majority moved With family (75.80%), while a smaller portion moved Alone (24.00%).

Visa Status Figure 7 shows the visa statuses of respondents. The largest groups hold a
Dependent visa (28.40%) and a Job visa (27.90%). Smaller proportions are Student (17.00%) and
Citizenship / Swedish passport (12.00%). The least common statuses are Permanent residence
(10.60%) and Job seeker (3.40%).

Visa Application Figure 8 shows who applied for the visa. The most common applicant was
Company (38.70%), followed by Yourself / Family Member (31.10%) and Agency (29.20%). A
small number did not specify (1.00%).

Visa Interview Figure 9 shows whether respondents had a visa interview. A majority did
not have an interview (70.30%), while 28.60% had one. A very small percentage did not
provide information (1.10%).

Education Level Figure 10 shows the highest level of education attained. Most respondents
have a Master’s degree (51.50%), followed by Bachelor’s degree (42.00%). PhD and above,
Higher secondary and Secondary are less common, with 5.00%, 1.10% and 0.30%, respec-
tively.

Educational Background Figure 11 shows the distribution of fields of study. The most
common field is Information Technology (IT) with 32.40%, followed by Electrical Engineering
(13.10%) and Mechanical Engineering (9.90%). Fields such as Tourism and Hospitality,
Social Sciences, Human Resources Management, Architecture and Design, and Agriculture and
Veterinary Science are among the least represented, each with 0.60%. The least common
educational background is Environmental Studies, with 0.30%.

Employment status Figure 12 shows the current employment status. The majority are Yes,
full-time employed (57.70%), followed by Yes, part-time employed (20.10%) and Job Searching
(15.10%). Students make up 6.70%, and Pensionist (0.10%) is a rare case. A small share
(0.30%) did not specify their status.

Job Match Figure 13 shows how well respondents feel their job matches their qualifications.
Most feel their job matches their qualifications (53.10%). However, 14.80% feel their job does
not match their qualifications, 9.40% feel Overqualified, and 0.70% feel Underqualified. A
significant portion (21.80%) finds the job match not applicable.

Salary Compliance Figure 14 shows participants’ awareness of Swedish salary regulations
and whether they are paid in accordance with them. A majority (64.00%) receive their salary in
accordance with Swedish salary regulations. A smaller proportion (7.30%) reported that they
do not. Although small compared to the majority, this figure is not negligible and points to
possible problems with how the regulations are enforced. A further 5.30% of participants are
not sure whether their salary follows the regulations, and 1.50% do not know about the
regulations at all. In addition, 21.80% of the participants marked the question as Not
applicable, suggesting that they are either not employed or that salary regulations do not
apply to their situation. A very small percentage (0.10%) did not provide a response.

Job Satisfaction Figure 15 shows job satisfaction levels. The largest group is very satisfied
(28.10%), followed closely by those who are satisfied (27.80%). A significant portion, 21.80%,
find job satisfaction not applicable. Smaller percentages are very dissatisfied (1.50%) and
dissatisfied (5.30%).

Job Problems Figure 16 shows job-related challenges faced by respondents. The most
frequently reported issue is the language barrier (79.30%), indicating significant difficulty in
communication. This is followed by lack of professional network (38.50%), which affects career growth
and job searching. Limited job opportunities in my field is reported by 25.70% of participants,
highlighting constraints in job availability specific to their qualifications. Issues such as lack of
recognition of foreign qualifications and cultural differences are less common but still notable.

Medical Care Satisfaction Figure 17 shows the satisfaction levels with medical care. The
largest group feels neutral (39.00%) about the medical services they receive, suggesting a
mixed or moderate experience. Satisfied and Very satisfied respondents account for 24.70%
and 6.10%, respectively, while 20.00% are dissatisfied and 7.10% very dissatisfied, indicating
notable dissatisfaction with medical care. A small percentage did not provide a response
(3.10%).

Health Issues Figure 18 displays reported health issues among participants. Vitamin
deficiencies are the most common health problem, reported by 43.30% of respondents, highlighting
a prevalent concern. Depression affects 16.10% of participants, indicating a notable mental
health issue. Common health issues/body or part pains and allergies are less frequent. The Not
applicable category is also large (37.40%), suggesting that many respondents do not
experience significant health issues.

Health Care Satisfaction Figure 19 shows ratings of overall health care after moving to
Sweden. Most respondents chose very much (24.00%), quite a lot (34.50%) or somewhat
(27.50%), indicating a generally positive view. Smaller proportions chose not much
(11.30%) or not at all (2.40%).

Food Culture Satisfaction Figure 20 shows satisfaction with local food culture. The largest
group is neutral (45.70%), showing mixed feelings about the local food. Satisfied respondents
make up 23.60%, and a smaller portion is dissatisfied (14.50%) or very dissatisfied (6.60%). A
small fraction is very satisfied (8.40%).

Family Time Satisfaction Figure 21 shows satisfaction with time spent with family. The
majority feel very satisfied (43.60%) or satisfied (34.60%), indicating a strong positive expe-
rience with family time. Smaller segments are neutral (14.50%), dissatisfied (3.80%), or very
dissatisfied (2.10%).

Life Satisfaction Figure 22 indicates overall life satisfaction. Most respondents are satisfied
(46.90%) or very satisfied (26.50%), reflecting a generally positive outlook on life. Smaller
percentages are neutral (19.80%), dissatisfied (5.20%), and very dissatisfied (1.00%).

District of Kerala Figure 23 shows the distribution of participants from various districts
in Kerala. Ernakulam has the highest representation (20.50%), followed by Thrissur (12.20%)
and Kottayam (10.10%). The least represented districts are Idukki (1.80%), Kasaragod (1.40%)
and Wayanad (1.00%).

County in Sweden Figure 24 shows the distribution of participants across Swedish counties.
The most represented county is Vastra Gotaland County (33.90%), followed by Skane
County (15.20%) and Stockholm County (13.70%). The least represented counties are
Gavleborg (0.30%), Vasternorrland (0.30%) and Orebro (0.10%).

Cultural Integration Figure 25 indicates participants’ perceptions of their cultural integration.
Half of the participants rate it as medium (50.70%), while a further 29.20% find it hard and
11.50% very hard, suggesting a challenging integration process for many. Fewer participants
find it Easy (5.40%) or Very easy (0.80%).

Challenges Faced Figure 26 shows the distribution of challenges faced by the respondents.
The largest group of participants reported experiencing challenges Sometimes (44.60%), while 22.20%
faced challenges Rarely. A significant portion also indicated they encountered challenges Of-
ten (18.30%). Those who faced challenges Very often constituted 7.10%, whereas only 6.30%
reported Never experiencing challenges. A small percentage of respondents (1.50%) did not
provide any response.

Accommodation Type Figure 27 shows the types of accommodation respondents are living
in. The majority reside in Apartments (64.80%), followed by those living in a House (15.10%).
Other types of accommodation include Shared living (5.60%) and various forms of Student
housing, such as Studio apartments (5.40%), Corridor rooms (4.20%), One-room apartment
(2.70%) and a Two or more room apartment (2.10%). A negligible number (0.10%) did not
specify their accommodation type.

Rent Range Figure 28 shows the distribution of respondents’ rent ranges. The most common
rent ranges are 5,000-7,500 SEK and 7,500-10,000 SEK, each accounting for 18.90% of the
respondents. Another significant group pays between 10,000-12,500 SEK (17.00%). Fewer
respondents pay 12,500-15,000 SEK (9.90%) and Less than 5,000 SEK (9.60%). Smaller groups
pay 15,000-17,500 SEK (3.90%), 17,500-20,000 SEK (1.80%), and More than 20,000 SEK
(1.10%). Additionally, 15.40% of respondents reported Not Paying Rent, while 3.50% did not
provide a response. Some respondents who own their homes may be repaying loans rather
than paying rent. They could appear either in the higher rent ranges or in the Not Paying
Rent category.

Parental Status Figure 29 shows respondents’ satisfaction with their parental status. A
significant portion of respondents reported being Very satisfied (26.30%) and Satisfied (25.70%)
with their parental status. Another large group indicated that this question was Not applicable
to them (35.10%). Those who were Neutral made up 9.10% of the respondents. Only a small
percentage expressed being Dissatisfied (1.00%) and Very dissatisfied (0.30%), with 2.70% not
providing a response.

Sweden Knowledge Figure 30 shows respondents’ level of knowledge about Sweden before
coming to Sweden. The largest group described themselves as Moderately knowledgeable
(45.80%), followed by those who are Slightly knowledgeable (33.40%). A smaller segment of
respondents considered themselves Not knowledgeable at all (14.80%), while 5.60% felt they
were Very knowledgeable. A minimal percentage (0.40%) did not respond to this question.

4.2 Summary of Statistics

Of the total sample of N = 716 respondents, most are female (58.40%), and the largest age
group is 33 to 37 years. The most frequently cited reason for moving to Sweden is a job
(46.40%). The largest group of respondents arrived in 2023, indicating a recent increase in
migration. Most people moved to Sweden with their families (75.80%). Regarding visa status,
the most common were dependent visas (28.40%) and job visas (27.90%). The largest share
(38.70%) had their visa applications handled by their companies, and most did not have a
visa interview (70.30%).
In terms of education, more than half of the respondents hold a Master’s degree (51.50%),
with a significant number having a background in Information Technology (32.40%). Em-
ployment data shows that 57.70% of participants are employed full-time. Most respondents
(53.10%) believe their current job matches their qualifications, and 64.00% report receiv-
ing salaries in compliance with Swedish regulations. Job satisfaction is generally high, with
28.10% being very satisfied. However, language barriers are a significant issue, affecting
79.30% of participants.
When it comes to medical care, 39.00% of respondents feel neutral about the quality of care
they receive. Health issues such as vitamin deficiencies are common, reported by 43.30% of
participants. Overall health care after moving to Sweden is rated fairly positively, with 34.50%
choosing "quite a lot". Opinions on local food culture are mixed, with 45.70% feeling neutral.
However, family time is a positive aspect for many, with 43.60% very satisfied with the time
they spend with their families.
Life satisfaction is high, with 46.90% of respondents being satisfied. The largest group of
respondents comes from the district of Ernakulam in Kerala (20.50%), and the largest group
lives in Vastra Gotaland County in Sweden (33.90%). Cultural integration seems to be
challenging for many, with 50.70% rating it as medium. The most common answer regarding
challenges is sometimes (44.60%).
In terms of living conditions, 64.80% of respondents live in apartments, and the most
common rent ranges are between 5,000 and 10,000 SEK. Satisfaction with parental status is
largely positive where the question applies, while 35.10% found it not applicable to their
situation. Finally, the largest group felt moderately knowledgeable about Sweden before
moving (45.80%).

5 Analysis of Statistical Modeling


In this section, the association between successful job matches and background factors is
analyzed. Before modelling "Job_Match" using Binary Logistic Regression with a large set
of categorical predictors, its association with a variety of factors is examined using simple
cross-tabulations.
Out of the total N = 716 observations, 558 individuals responded that they had a job. Of
these 558 valid responses, 545 were complete cases (see Figure 31 in Appendix G); the
remaining 13 responses contained some missing data. To address this, Multiple Imputation
(Zhang, 2016) was used to fill in the missing values. After imputation, all 558 observations
were complete and were retained for analysis, as shown in Table 15 (Appendix G).

5.1 Appropriate Dependent Variable

Originally, "Job_Match" had five categories: "Not applicable," "Yes, it matches," "No, it doesn’t
match," "No, I am overqualified," and "No, I am underqualified." For our analysis, we excluded
the "Not applicable" responses, which left us with 558 valid responses out of 716.
Since the counts for "No, it doesn’t match," "No, I am overqualified," and "No, I am under-
qualified" were very low compared to "Yes, it matches" (as shown in Figure 13), we combined
these three categories into a single "No" category. The "Yes, it matches" category was kept as
is. We coded "No" as 0 and "Yes, it matches" as 1.
By recoding the responses in this way, "Job_Match" became a binary variable, making it
suitable for binary logistic regression analysis.
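
The recoding described above can be sketched as follows. This is a minimal illustration in Python, not the R code used in the thesis; the helper name recode_job_match and the exact answer strings are assumptions made for the example.

```python
# Hypothetical labels: assumed to mirror the survey answer options quoted in the text.
MATCH_LABEL = "Yes, it matches"
MISMATCH_LABELS = {
    "No, it doesn't match",
    "No, I am overqualified",
    "No, I am underqualified",
}
EXCLUDED_LABEL = "Not applicable"


def recode_job_match(answers):
    """Return 0/1 codes for Job_Match, dropping 'Not applicable' answers."""
    coded = []
    for answer in answers:
        if answer == EXCLUDED_LABEL:
            continue  # excluded from the analysis sample
        if answer == MATCH_LABEL:
            coded.append(1)  # "Yes, it matches" is coded 1
        elif answer in MISMATCH_LABELS:
            coded.append(0)  # the three "No" categories are merged and coded 0
        else:
            raise ValueError(f"Unexpected answer: {answer!r}")
    return coded


# Made-up answers for illustration only, not survey data.
example_answers = [
    "Yes, it matches",
    "Not applicable",
    "No, I am overqualified",
    "No, it doesn't match",
]
print(recode_job_match(example_answers))
```

Raising an error on unknown answers is a design choice in this sketch, so that free-text or misspelled categories are not silently coded as mismatches.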

5.2 Absence of Outliers

As a consequence of some categories having very low responses, smaller categories were com-
bined into broader ones. The changes were made as follows:

• Gender: The categories "Female" and "Other" were combined into a single category
called "Female." This was done to maintain data balance and improve the model’s per-
formance. As shown in Figure 2 (Appendix F), "Female" and "Other" were aggregated
to avoid losing information from less frequent gender categories. This approach ensures
that the model is not adversely affected by low-frequency categories. While outliers in
categorical data are less of a concern, ensuring balance between categories is important
for model accuracy.

• Age: Ages above 43 were grouped into one category named "43 and above". (See Figure
3, Appendix F)

• Arrival Year: All years before 2018, including 2018, were combined into the category
"2018 and before". The years 2023 and 2024 were grouped into "2023-2024" since 2024
is not yet complete. (See Figure 5, Appendix F)

• Education Level: The categories "Secondary" and "Higher secondary" were combined
into "Higher secondary and below". (See Figure 10, Appendix F)

• Educational Background: Educational backgrounds with fewer than 40 responses (see
Figure 11, Appendix F) were grouped in the two ways given below:

– Several engineering fields such as "Civil Engineering," "Other Engineering," and
"Electronics and Communication Engineering" were grouped into "Other Engineering".

– Various educational backgrounds including "Natural Sciences," "Finance and Ac-
counting," "Data Science," "Education and Pedagogy," "Media and Communica-
tions," "Human Resources Management," "Architecture and Design," "Agriculture
and Veterinary Science," "Legal Studies," "Mathematics/Statistics," "Tourism and
Hospitality," "Environmental Studies," "Arts and Humanities," and "Social Sci-
ences" were combined into "Other Education background".

• Has Job: The categories "Yes, full-time" and "Pensionist" were combined into "Yes, full-
time" because pensionists receive income similar to full-time employees. (See Figure 12,
Appendix F)

• County Sweden: Counties with fewer than 40 responses (See Figure 24, Appendix F)
were grouped into an "Other county" category. The "Other county" category includes:

– Kronoberg County, Ostergotland County, Dalarna County, Halland County,

– Sodermanland County, Vastmanland County, Gavleborg County, Orebro County,

– Vasternorrland County, Gotland County, Jonkoping County.
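
The grouping rule used above for counties and educational backgrounds (categories with fewer than 40 responses are merged) can be sketched as below. This is an illustrative Python sketch under stated assumptions, not the thesis's R code: the function name collapse_rare is hypothetical and the data are made up.

```python
from collections import Counter


def collapse_rare(values, min_count=40, other_label="Other"):
    """Replace every category seen fewer than min_count times with other_label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]


# Made-up toy data for illustration only, not the survey responses.
counties = ["Stockholm County"] * 50 + ["Gotland County"] * 3 + ["Orebro County"] * 1
collapsed = collapse_rare(counties, min_count=40, other_label="Other county")
print(Counter(collapsed))
```

The threshold is counted on the data passed in, so the result depends on whether the rule is applied before or after excluding respondents from the analysis sample.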

5.3 Cross-tabulation of Dependent-Independent variables for Modelling Variables

5.3.1 Job Match by Gender

Table 1: Cross-tabulation of Job_Match and Gender

                  Gender
Job Match    Female     Male    Total
No               99       79      178
Yes             183      197      380
Total           282      276      558

Table 1 shows how job match (whether a participant’s job matches their qualifications)
relates to gender. Out of 558 participants, 380 (68.1%) reported that their job matched their
qualifications. Of those who reported a job match, 48.2% are female.
Looking more closely at gender, out of 282 female participants, 183 (64.9%) reported a
job match, while 99 (35.1%) did not. Among the 276 male participants, 197 (71.4%) reported
a job match, while 79 (28.6%) reported a mismatch. In this sample, females are therefore
somewhat more likely to report a job mismatch than males. Although most participants feel
their jobs align with their qualifications, a notable portion, particularly among females,
does not.
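
The percentages quoted for the cross-tabulations in this section are shares within each column (for example, within each gender). A minimal Python sketch of that calculation, using the counts from Table 1, is given below; the function name column_percentages is a hypothetical helper and the code is an illustration rather than the thesis's own R code.

```python
# Counts copied from Table 1 (rows: Job_Match, columns: Gender).
table_1 = {
    "No": {"Female": 99, "Male": 79},
    "Yes": {"Female": 183, "Male": 197},
}


def column_percentages(table):
    """Return each cell as a percentage of its column total."""
    column_totals = {}
    for row in table.values():
        for column, count in row.items():
            column_totals[column] = column_totals.get(column, 0) + count
    return {
        row_name: {
            column: 100 * count / column_totals[column]
            for column, count in row.items()
        }
        for row_name, row in table.items()
    }


percentages = column_percentages(table_1)
for row_name, row in percentages.items():
    for column, value in row.items():
        print(f"Job match = {row_name}, {column}: {value:.1f}%")
```

The same helper applies to the other cross-tabulations in this section once their counts are entered in the same nested form.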

5.3.2 Job Match by Age

Table 2: Cross-tabulation of Job_Match and Age

                                 Age
Job Match   18-27   28-32   33-37   38-42   43 and above   Total
No             27      70      49      25              7     178
Yes            18      86     160      82             34     380
Total          45     156     209     107             41     558

Table 2 displays the cross-tabulation between the Job_Match variable and Age groups. The
results show how participants’ age relates to whether their job matches their qualifications.
Out of the 558 participants, 380 (68.1%) reported a job match, while 178 (31.9%) reported
a job mismatch. Among the youngest age group, 18-27, 60.0% reported a job mismatch,
while 40.0% reported a job match. For the 28-32 age group, 55.1% reported a job match,
while 44.9% reported a mismatch. The 33-37 age group showed a clearly higher percentage of
job matches, with 76.6% reporting a match and only 23.4% reporting a mismatch. Participants
aged 38-42 showed the same pattern, with 76.6% reporting a job match and 23.4% reporting a
mismatch. Finally, the 43 and above age group had the highest percentage of job matches,
with 82.9% reporting a match and only 17.1% reporting a mismatch.
This data suggests that job match improves with age, with older participants being more
likely to find jobs that align with their qualifications.

5.3.3 Job Match by Arrival Year

Table 3: Cross-tabulation of Job_Match and Arrival_Year

                                     Arrival Year
Job Match   2018 and before   2019   2020   2021   2022   2023-2024   Total
No                       24      6      9     21     39          79     178
Yes                     112     36     27     62     72          71     380
Total                   136     42     36     83    111         150     558

Table 3 presents the cross-tabulation between Job_Match and the year of arrival in Sweden.
This analysis reveals the relationship between the duration of stay in Sweden and job match
outcomes. Participants who arrived in 2018 or before showed a high percentage of job
matches (82.4%), with only 17.6% reporting a mismatch. For those arriving in 2019, 85.7%
reported a job match, while 14.3% reported a mismatch. Participants who arrived in 2020 had
a lower job match percentage (75.0%), and 25.0% reported a mismatch. Those arriving in 2021
showed a similar pattern, with 74.7% reporting a job match and 25.3% reporting a mismatch.
For those arriving in 2022, 64.9% reported a job match, with 35.1% reporting a mismatch.
Finally, among the participants who arrived in 2023-2024, 47.3% reported a job match
and 52.7% reported a mismatch.
This data indicates that participants who have been in Sweden longer tend to have a higher
likelihood of finding jobs that match their qualifications.

5.3.4 Job Match by County in Sweden

Table 4: Cross-tabulation of Job_Match and County_Sweden

                                     County Sweden
Job Match   Other   Skane   Stockholm   Uppsala   Vasterbotten   Vastra Gotaland   Total
No             51      20          17        22             18                50     178
Yes            63      67          53        32             28               137     380
Total         114      87          70        54             46               187     558

Table 4 shows the cross-tabulation between Job_Match and the county in Sweden where
participants reside. Participants residing in the "Other" counties category showed a job match
percentage of 55.3%, while 44.7% reported a mismatch. In Skane, 77.0% reported a job
match, with 23.0% reporting a mismatch. In Stockholm, 75.7% of participants reported a job
match, and 24.3% reported a mismatch. Uppsala participants had 59.3% reporting a job match
and 40.7% a mismatch. For Vasterbotten, 60.9% reported a job match, while 39.1% reported
a mismatch. Lastly, in Vastra Gotaland, 73.3% reported a job match, with 26.7% reporting a
mismatch.
These findings show that, for this sample, job match likelihood varies by county. Skane,
Stockholm and Vastra Gotaland show match rates above 73%, whereas Uppsala, Vasterbotten
and the "Other" category show match rates of about 61% or lower.

5.3.5 Job Match by Education Level

Table 5: Cross-tabulation of Job_Match and Education_Level

                                   Education Level
Job Match   Bachelor’s   Higher secondary   Master’s   PhD and   Total
            degree       and below          degree     above
No                  71                  4        101         2     178
Yes                167                  5        176        32     380
Total              238                  9        277        34     558

Table 5 provides the cross-tabulation between Job_Match and the education level of
participants. Among participants with a Bachelor’s degree, 70.2% reported a job match, while
29.8% reported a mismatch. Those with higher secondary education or below had the lowest
job match percentage (55.6%), with 44.4% reporting a mismatch, although this group contains
only 9 respondents. Participants with a Master’s degree had a job match percentage of 63.5%,
and 36.5% reported a mismatch. For those with a PhD or above, 94.1% reported a job match,
while only 5.9% reported a mismatch.
The data show that a PhD in particular is associated with a higher likelihood of finding a
job that matches one’s qualifications. The relationship is not strictly increasing with
education level, since Bachelor’s degree holders report a higher match rate than Master’s
degree holders.

5.3.6 Job Match by Educational Background

Table 6: Cross-tabulation of Job_Match and Educational_Background

                                          Educational Background
Job Match   Business   Electrical    Information   Mechanical    Medical   Other       Other         Total
                       Engineering   Technology    Engineering             Education   Engineering
No                19            17            35            21        20          34            32     178
Yes               16            60           138            45        22          48            51     380
Total             35            77           173            66        42          82            83     558

Table 6 presents the cross-tabulation between Job_Match and participants’ educational
backgrounds. Among participants with a Business, Management, or Marketing background, 45.7%
reported a job match, while 54.3% reported a mismatch. In Electrical Engineering, 77.9%
reported a job match, and 22.1% reported a mismatch. Information Technology participants
had the highest job match percentage (79.8%), with only 20.2% reporting a mismatch.
Participants with a Mechanical Engineering background reported a 68.2% job match, while 31.8%
reported a mismatch. For those in the Medical and Health Sciences field, 52.4% reported a
job match, while 47.6% reported a mismatch. In Other Education fields, 58.5% reported a job
match, with 41.5% reporting a mismatch. Lastly, participants from Other Engineering fields
had a 61.4% job match percentage, with 38.6% reporting a mismatch.
This data shows considerable variation in job match across educational backgrounds,
with Information Technology and Electrical Engineering exhibiting the highest job match
percentages and Business, Management, or Marketing the lowest.

5.3.7 Job Match by Employment Status

Table 7: Cross-tabulation of Job_Match and Has_Job

                        Has Job
Job Match   Yes, full-time   Yes, part-time   Total
No                      58              120     178
Yes                    356               24     380
Total                  414              144     558

Table 7 shows the cross-tabulation between Job_Match and the type of job participants
currently have (full-time or part-time). Among those who reported having a full-time job,
86.0% reported a job match, while 14.0% reported a mismatch. Participants with part-time
jobs were more likely to report a job mismatch, with 83.3% reporting a mismatch and only
16.7% reporting a job match.
These results show that, for this sample, full-time employment is strongly associated
with having a job that matches one’s qualifications, while part-time employment is linked to a
higher rate of job mismatch.

5.4 Binary Logistic Regression and Model Selection

In the following, the factors that are most important in describing job matching are examined
using Lasso logistic regression (see footnote 5), that is, the Lasso in a Generalized Linear
Model (GLM) (Friedman et al., 2010). Here, the response variable is binary and the independent
variables are factors, and the Lasso in a GLM is designed to handle binary outcomes and factor
variables appropriately.
To determine the size of the penalty parameter, denoted λ in the Lasso regression,
cross-validation was employed (see footnote 6). An estimate of λ = 0.0128 was obtained. This
is a small penalty, which indicates that the Lasso is not strongly penalizing the coefficients
and that the model is relatively unrestricted. A weak penalty allows a closer fit to the
training data, while choosing λ by cross-validation helps to limit overfitting.

Explanation of the LASSO Coefficient Paths Figure 1 shows the coefficient paths for var-
ious predictors as the penalty parameter λ changes. In Lasso regression, as λ increases, the
model increasingly penalizes larger coefficients, driving many of them to zero. This results in
the elimination of less important predictors, making the model more interpretable and reducing
the risk of overfitting. The figure visually represents this process, showing how coefficients are
shrunk to zero as λ increases. The coefficients that remain non-zero as λ grows are those that
have the strongest association with the outcome variable (Agresti, 2015).
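
The shrinkage behaviour described above can be illustrated with the soft-thresholding operator, which is the building block of the coordinate-descent algorithm for the Lasso in Friedman et al. (2010). The sketch below is a simplified Python illustration, not the thesis's R code: it applies the operator directly to made-up coefficient values (an assumption that corresponds to the idealized case of standardized, uncorrelated predictors), so the numbers are not estimates from the survey data.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam, returning exactly 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Made-up unpenalized coefficients for three hypothetical predictors.
unpenalized = {"x1": 1.5, "x2": -0.4, "x3": 0.05}

# As lam increases, more coefficients are driven to exactly zero.
for lam in (0.0, 0.1, 0.5, 2.0):
    path = {name: soft_threshold(value, lam) for name, value in unpenalized.items()}
    n_nonzero = sum(1 for value in path.values() if value != 0.0)
    print(f"lambda = {lam}: {path} ({n_nonzero} non-zero)")
```

Predictors with large unpenalized coefficients survive longest along the path, which is why the coefficients that remain non-zero at larger λ are read as the most strongly associated with the outcome.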

5.4.1 Results

The coefficients from the final Lasso logistic regression model indicate which factors are most
important in predicting whether someone obtains a job that matches their qualifications. As
the levels of the factors have been included as separate binary variables in the Lasso model,
they will be denoted as variables or categories in the discussion below. The variables
identified as important in this model are those with non-zero coefficients, while variables with
zero coefficients are considered unimportant.

Footnote 5: Alternative approaches that were considered included using information criteria,
e.g., the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), or
Ridge regression.
Footnote 6: Lasso logistic regression effectively shrinks some coefficients to zero, performing
automatic variable selection. This helps identify the most relevant predictors while avoiding
overfitting (Agresti, 2013; Agresti, 2015). The R code used for model selection is provided in
Appendix B.

From each coefficient, it can be determined how a specific category affects the chances of
obtaining a job match. A positive coefficient indicates that the category increases the chances
of a job match compared to the base-level category, whereas a negative coefficient indicates
that the category decreases the chances of a job match compared to the base level. These
coefficients assist in understanding which categories make it more or less likely for a migrant
to find a job that fits their qualifications.

Base Level Variables Table 12 lists the base-level categories in the model. These base levels
are the reference categories against which the other categories of the same variable are
compared. For example, the base level for Gender is Female, so the coefficient for Male is
interpreted relative to females. The choice of base level matters in regression with
categorical variables because every coefficient is interpreted relative to it. In this model,
the most frequent category of each variable was chosen as the base level.
The base-level categories are given below:

• Gender: Female

• Age: 33-37

• Arrival Year: 2023-2024

• County Sweden: Vastra Gotaland County

• Education Level: Master’s degree

• Educational Background: Information Technology

• Has Job: Yes, full-time

Positive Coefficients Table 8 lists the predictors with positive coefficients from the Lasso
logistic regression model. These variables increase the chances of getting a job that matches
your qualifications.

Table 8: Positive Coefficients in LASSO

Variable                               Coefficient   Standard Error (SE)
Intercept                                   2.0384   0.2350
Gender (Male)                               0.1806   0.1968
County Sweden (Stockholm County)            0.2271   0.3326
Education Level (PhD and above)             1.3629   0.5619

• Intercept: The intercept coefficient of 2.0384 represents the starting point, or baseline
log-odds, of finding a job match when all other factors are at their default categories (base
level categories).

• Gender (Male): A positive coefficient of 0.1806 means that being male slightly increases
the likelihood of getting a job that matches your qualifications compared to being female.

• County Sweden (Stockholm County): A positive coefficient of 0.2271 shows that living
in Stockholm County increases the chances of getting a job that matches their qualifica-
tions compared to the County Vastra Gotaland.

• Education Level (PhD and above): The highest positive coefficient, 1.3629, indicates
that having a PhD or a higher degree greatly increases the likelihood of finding a job that
matches your qualifications compared to the Master’s degree.
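
Because the coefficients are on the log-odds scale, it can help to translate them into odds ratios and predicted probabilities. The Python sketch below does this for the values in Table 8. It is an illustration, not part of the thesis's R analysis; the names sigmoid, odds_ratios and the two probability variables are introduced here for the example. Since Lasso estimates are shrunk towards zero, the resulting odds ratios should be read as conservative, approximate figures.

```python
import math


def sigmoid(log_odds):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Values copied from Table 8.
intercept = 2.0384
positive_coefficients = {
    "Gender (Male)": 0.1806,
    "County Sweden (Stockholm County)": 0.2271,
    "Education Level (PhD and above)": 1.3629,
}

# exp(coefficient) is the odds ratio relative to the base-level category.
odds_ratios = {name: math.exp(beta) for name, beta in positive_coefficients.items()}

# Predicted probability of a job match when every factor is at its base level.
baseline_probability = sigmoid(intercept)

# Same profile, but with a PhD or above instead of a Master's degree.
phd_probability = sigmoid(
    intercept + positive_coefficients["Education Level (PhD and above)"]
)

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.2f}")
print(f"Baseline probability of a job match: {baseline_probability:.3f}")
print(f"Probability with PhD and above:      {phd_probability:.3f}")
```

The baseline here refers to the base-level profile listed earlier (Female, aged 33-37, arrived 2023-2024, Vastra Gotaland County, Master's degree, Information Technology, full-time job).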

Negative Coefficients Table 9 lists the predictors with negative coefficients from the Lasso
logistic regression model. These variables decrease the likelihood of a job match relative to
the corresponding base level.

Table 9: Negative Coefficients in LASSO

Variable                                                     Coefficient   Standard Error (SE)
Age (18-27)                                                     -0.01485   0.2614
Age (28-32)                                                     -0.18515   0.2191
Arrival Year (2022)                                             -0.04750   0.1584
County Sweden (Vasterbotten County)                             -1.02555   0.4218
Educational Background (Business, Management, Marketing)        -1.11167   0.5185
Educational Background (Medical and Health Sciences)            -0.66698   0.4105
Educational Background (Other Education background)             -0.16078   0.3320
Educational Background (Other Engineering)                      -0.56349   0.2968
Has Job (Yes, part-time)                                        -3.36002   0.2658

• Age (18-27): A negative coefficient of -0.01485 indicates that younger individuals (aged
18-27) are slightly less likely to experience a job match than the 33-37 base group. The
coefficient is very close to zero.

• Age (28-32): The small negative coefficient of -0.18515 shows a slight decrease in job
match likelihood for this age group.

• Arrival Year (2022): A negative coefficient of -0.04750 suggests that those who arrived
in 2022 are slightly less likely to find a job match than those who arrived in 2023-2024.

• County Sweden (Vasterbotten County): A large negative coefficient of -1.02555
shows a decreased likelihood of job match for residents of Vasterbotten County.

• Educational Background (Business, Management, Marketing): With a coefficient of
-1.11167, individuals with a background in Business, Management, or Marketing are less
likely to find a job match.

• Educational Background (Medical and Health Sciences): A negative coefficient of
-0.66698 indicates a lower likelihood of job match for those with a Medical or Health
Sciences background.

• Educational Background (Other Education background): A coefficient of -0.16078
shows a decreased likelihood of job match for participants with other education
backgrounds.

• Educational Background (Other Engineering): A coefficient of -0.56349 indicates a
lower likelihood of job match for individuals with other engineering backgrounds.

• Has Job (Yes, part-time): The most substantial negative coefficient, -3.36002, suggests
that part-time employment is strongly associated with a lower likelihood of a job match.

Unimportant Variables Table 13 lists the variables whose coefficients were shrunk to zero in
the Lasso logistic regression model. These variables are Age (38-42, 43 and above), Arrival
Year (2018 and before, 2019, 2020, 2021), County Sweden (Skane County, Uppsala County,
Other County), Education Level (Bachelor’s degree, Higher secondary and below), and
Educational Background (Electrical Engineering, Mechanical Engineering). Their exclusion
indicates that, given the other predictors in the model, they do not contribute noticeably to
predicting whether a participant’s job matches their qualifications.

5.4.2 Model Performance

The performance of the Lasso logistic regression model was evaluated using several key met-
rics: accuracy, precision, recall, and F1-score. These metrics help in understanding how well
the model is able to correctly predict job matches and how it balances between false positives
and false negatives.

Table 10: Confusion Matrix: Cross-tabulation of Predicted vs. Actual Job Match

                       Actual Job Match
Predicted Job Match    No     Yes    Total
No                     120    23     143
Yes                    58     357    415
Total                  178    380    558

Confusion Matrix The confusion matrix (Table 10) shows the cross-tabulation of predicted
versus actual job matches. It is a crucial tool for visualizing the performance of a classification
model. True Positives (TP): the model correctly predicted 357 actual job matches as matches.
True Negatives (TN): the model correctly identified 120 non-matches as non-matches. False
Positives (FP): the model incorrectly predicted 58 non-matches as matches. False Negatives
(FN): the model incorrectly classified 23 actual matches as non-matches.

Performance Metrics The performance metrics, calculated in Appendix C, are summarized in
Table 11. Accuracy measures how often the model's predictions were correct: the model
classified 85.48% of all cases correctly. Precision indicates how many of the job matches
predicted by the model were actually matches; a precision of 86.02% means that most of the
predicted job matches were indeed correct. Recall (Sensitivity or True Positive Rate) shows
how many of the actual job matches were identified by the model; a recall of 93.95% indicates
that the model identified a large proportion of the actual job matches. The F1 Score is the
harmonic mean of precision and recall, providing a balance between the two. The F1 score of
89.81% indicates that the model combines high precision with high recall in predicting job
matches.

Table 11: Performance Metrics for Lasso Logistic Regression Model

Metric                  Value
Accuracy                0.8548
Precision               0.8602
Recall (Sensitivity)    0.9395
F1 Score                0.8981

6 Discussion

6.1 Results

The starting point of this thesis was the experience of moving from Kerala, a state in India, to
Sweden and the numerous questions regarding migration from Kerala to Sweden. The problem
in answering many of these questions stemmed from a lack of data, as no data exists on people
from Kerala in Sweden. To address this issue, a survey was designed, and data was collected
from individuals of Kerala origin, making it possible to answer some of these questions.
The results from the survey indicate that immigration from Kerala to Sweden began
before 2000 and includes individuals of all ages (including pensioners), living in various types
of housing, with apartments being the most common. Furthermore, among the respondents,
Vastra Gotaland County has the highest number of Malayalis in Sweden, and there are more
females than males within the Malayali community. Most respondents expressed satisfaction
with their lives in Sweden, although not all did. Issues were raised regarding low-paying jobs
and dissatisfaction with personal, family, and overall life after moving to Sweden. All districts
in Kerala are represented in Sweden, although responses were not received from some counties.
Most counties in Sweden have Malayali representation. Interestingly, a few individuals did not
encounter any challenges during their first year. In terms of housing, while apartments are the
most common choice, houses are also quite popular.
The Lasso logistic regression model proved effective in predicting job-qualification
alignment with notable accuracy and precision. From this data analysis, it was found that higher
educational qualifications significantly improve the chances of finding a job that matches one’s
skills. Although Sweden is recognized as one of the best countries for gender equality, the
survey indicates that males still have a higher chance of securing a qualified job than females.
Stockholm County, which contains the capital city of Sweden, appears to be the most promising
location for finding a qualified job.

6.2 Implications

Migrants coming to Sweden may improve their job prospects by pursuing higher
education, such as a Master’s or PhD degree. The timing of the move also appears to matter;
for example, those who arrived in 2022 seem somewhat less likely to have jobs that fit their
skills than those who moved more recently (2023-2024). Prospective migrants may therefore
benefit from considering when they move, as changes in the job market and migration rules
can affect job chances. For policymakers, it is important to offer support that fits the needs of
different groups of migrants based on when and where they arrived. In addition, some migrants
report salaries below the regulated level, so it is important to make sure workers know their
rights. Many migrants are also unhappy with their personal and family life after moving, so
policies that help with work-life balance and overall well-being would be useful. By improving
education opportunities and providing targeted support, along with addressing job quality and
satisfaction, both migrants and policymakers can work together to achieve better job matches
and a better quality of life in Sweden.

6.3 Limitations and Challenges

The main issue with the study was the lack of a sampling frame for people from Kerala, which
made it impossible to use a probability sampling method. As a result, a combination of non-
probability sampling methods, such as judgmental sampling and snowball sampling, was used.
Both judgmental sampling and snowball sampling suffer from selection bias, limited
generalizability, and difficulty in reaching diverse populations (Atkinson and Flint, 2001). To reduce
selection bias, efforts were made to include all counties in Sweden and reach different age
groups and migrants from various years through judgmental sampling. However, the sample
may not represent all Kerala migrants in Sweden, so caution should be exercised when gen-
eralizing the results to the whole population. A further concern is that the data only captures
a snapshot at a single point in time, which limits the understanding of long-term integration
experiences.
Future research should aim to use a larger, more varied sample and include methods that
track changes over time to provide a clearer picture of migrant experiences. Since detailed
information about the overall population is lacking, a proper sampling frame could not be
established. The implication is that the results from the analysis may be difficult to generalize.
Furthermore, some problems with missing information were encountered.
Future research should also consider comparing the experiences of migrants from other
regions to identify broader patterns and challenges. Longitudinal studies that track changes
over time will provide a deeper understanding of how job alignment and integration evolve.
Given the current limitations in population data, future studies should aim to include a larger
and more diverse sample to better understand the experiences of Kerala migrants in Sweden.

References
Agresti, A. (2013). Categorical Data Analysis. John Wiley & Sons, 3rd edition.

Agresti, A. (2015). Foundations of Linear and Generalized Linear Models. John Wiley &
Sons.

Atkinson, R. and Flint, J. (2001). Accessing hidden and hard-to-reach populations: Snowball
research strategies. Social Research Update, 33:1–4.

Bratsberg, B., Ragan, J. F., and Nasir, Z. M. (2002). Foreign-born workers in the US labor
market. Journal of Economic Literature, 40(1):105–138.

Cassarino, J.-P. (2004). Theorising return migration: The conceptual approach to return mi-
grants revisited. International Journal on Multicultural Societies, 6(2):253–279.

Chiswick, B. R. and Miller, P. W. (2003). The skills of immigrants in the US: Education and
gender. Research in Labor Economics, 22:229–255.

Evans, J. R. and Mathur, A. (2009). Online surveys and migrant populations. Journal of
Migrant Studies, 15(2):123–145.

Friedman, J., Hastie, T., and Tibshirani, R. (2010). Regularization paths for generalized linear
models via coordinate descent. Journal of Statistical Software, 33(1):1–22.

Goodman, L. A. (1961). Snowball sampling. The Annals of Mathematical Statistics,
32(1):148–170.

Groves, R. M., Fowler Jr, F. J., Couper, M. P., Lepkowski, J. M., Singer, E., and Tourangeau,
R. (2011). Survey Methodology. Wiley.

Larsson, C. (2024). Indian high-skilled labor migrants in Sweden: A study about social
integration, interpersonal communication, and national identification.

Lee, E. S. (1966). A theory of migration. Demography, 3(1):47–57.

Lundborg, P. and Skedinger, P. (2013). Ethnic enclaves and the economic success of
immigrants: Evidence from Sweden. The Scandinavian Journal of Economics, 115(3):905–929.

Martin, P. (2009). Recession, Migrants, and the Welfare State. University of California Press.

Perla, R. J. and Provost, L. P. (2012). Judgment sampling: A health care improvement perspec-
tive. Quality Management in Health Care, 21(3):169–175.

Piracha, M. and Vadean, F. (2013). Immigrant overeducation: A literature review and gaps.
Journal of Economic Surveys, 27(4):963–988.

Zhang, Z. (2016). Multiple imputation with multivariate imputation by chained equation (mice)
package. Annals of Translational Medicine, 4(2):30.

Appendix

A Base Level and Unimportant Variables in LASSO

Table 12: Base Level Variables

Variable (Base Level)

Gender (Female)
Age (33-37)
County Sweden (Vastra Gotaland County)
Education Level (Masters degree)
Educational Background (Information Technology)
Has Job (Yes, full-time)
Arrival Year (2023-2024)

Table 13: Unimportant Variables in LASSO

Variable

Age (38-42)
Age (43 and above)
Arrival Year (2018 and before)
Arrival Year (2019)
Arrival Year (2020)
Arrival Year (2021)
County Sweden (Skane County)
County Sweden (Uppsala County)
County Sweden (Other County)
Education Level (Bachelors degree)
Education Level (Higher secondary and below)
Educational Background (Electrical Engineering)
Educational Background (Mechanical Engineering)

B Model Implementation
R Code The following R code was used to implement the Lasso logistic regression model:

library(glmnet)

# Perform LASSO logistic regression with cross-validation
# (x: predictor matrix, y: binary job-match outcome)
lasso_model <- cv.glmnet(x, y, alpha = 1, family = "binomial")
best_lambda <- lasso_model$lambda.min

# Fit the final model using the best lambda value
final_model <- glmnet(x, y, alpha = 1, family = "binomial",
                      lambda = best_lambda)

In this code, cv.glmnet is used to perform cross-validation and to find the optimal
lambda value (lambda.min) that minimizes the cross-validation error. The final model is
then fitted using this best lambda value, which is 0.0128.
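
The following sketch illustrates the same workflow end to end on simulated data, including how the selected coefficients and a confusion matrix can be obtained at lambda.min. It is not the thesis code: the data frame `survey`, its two predictors, the simulated effect sizes, and the object names `cv_fit` and `pred` are assumptions made only for illustration, and the glmnet package must be installed.

```r
library(glmnet)

set.seed(1)  # reproducible simulated data and cross-validation folds

# Hypothetical stand-in for the survey data: two categorical predictors
# and a binary outcome. The real variables and levels differ.
n <- 300
survey <- data.frame(
  Gender  = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  Has_Job = factor(sample(c("Yes, full-time", "Yes, part-time"), n,
                          replace = TRUE))
)
p_match <- plogis(1 - 2 * (survey$Has_Job == "Yes, part-time"))
survey$Job_Match <- rbinom(n, 1, p_match)

# Dummy-code the factors; the first level of each factor is the base level.
x <- model.matrix(Job_Match ~ ., data = survey)[, -1]
y <- survey$Job_Match

# Cross-validated LASSO logistic regression (alpha = 1 selects the LASSO).
cv_fit <- cv.glmnet(x, y, alpha = 1, family = "binomial")

# Coefficients at the lambda that minimises the cross-validation error;
# predictors shrunk to exactly zero have been dropped by the LASSO.
print(coef(cv_fit, s = "lambda.min"))

# Predicted classes at the same lambda, cross-tabulated against the truth.
pred <- as.vector(predict(cv_fit, newx = x, s = "lambda.min", type = "class"))
print(table(Predicted = pred, Actual = y))
```

Note that a confusion matrix computed on the same data used for fitting, as in this sketch, tends to give an optimistic picture of predictive performance.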

C Performance Metrics Calculation

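\[
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = \frac{357 + 120}{558} = 0.8548 \quad (85.48\%)
\]

\[
\text{Precision} = \frac{TP}{TP + FP} = \frac{357}{357 + 58} = 0.8602 \quad (86.02\%)
\]

\[
\text{Recall (Sensitivity or True Positive Rate)} = \frac{TP}{TP + FN} = \frac{357}{357 + 23} = 0.9395 \quad (93.95\%)
\]

\[
\text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = 2 \times \frac{0.8602 \times 0.9395}{0.8602 + 0.9395} = 0.8981 \quad (89.81\%)
\]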
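
The same calculation can be reproduced in a few lines of R from the counts in Table 10. This is a minimal sketch; the variable names (`tp`, `tn`, `fp`, `fn`, `metrics`) are illustrative and do not come from the original analysis code.

```r
# Counts from Table 10 (rows: predicted class, columns: actual class).
tp <- 357  # predicted match, actual match
tn <- 120  # predicted non-match, actual non-match
fp <- 58   # predicted match, actual non-match
fn <- 23   # predicted non-match, actual match

accuracy  <- (tp + tn) / (tp + tn + fp + fn)
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
f1        <- 2 * precision * recall / (precision + recall)

metrics <- round(c(accuracy = accuracy, precision = precision,
                   recall = recall, f1 = f1), 4)
print(metrics)
```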

D Multicollinearity
Lasso binary logistic regression was selected as the primary modelling technique for predicting
job matches. This method is particularly advantageous for datasets with multicollinearity
among predictors, and it performs feature selection at the level of individual categories.
Table 14 shows the generalized Variance Inflation Factors (GVIF) for the independent
variables. High GVIF values (typically > 10) or high GVIF^(1/(2·Df)) values (typically > 2.5)
suggest multicollinearity issues. Since all values are below these thresholds, there is little
reason to be concerned about multicollinearity.

Table 14: Variance Inflation Factors (VIF) for Independent Variables

Variable                  GVIF    Df      GVIF^(1/(2·Df))
Gender                    1.37    1.00    1.17
Age                       1.68    4.00    1.07
Arrival_Year              1.76    5.00    1.06
County_Sweden             1.71    5.00    1.06
Education_Level           1.64    3.00    1.09
Educational_Background    2.09    6.00    1.06
Has_Job                   1.39    1.00    1.18
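
The adjusted values in the last column follow directly from the GVIF and the degrees of freedom of each variable. The sketch below recomputes them from the figures in Table 14; the data frame `vif_table` and the column name `adj_gvif` are illustrative names, not objects from the original analysis.

```r
# GVIF and degrees of freedom as reported in Table 14.
vif_table <- data.frame(
  variable = c("Gender", "Age", "Arrival_Year", "County_Sweden",
               "Education_Level", "Educational_Background", "Has_Job"),
  gvif = c(1.37, 1.68, 1.76, 1.71, 1.64, 2.09, 1.39),
  df   = c(1, 4, 5, 5, 3, 6, 1)
)

# Adjusted GVIF: GVIF^(1 / (2 * Df)), comparable across variables
# with different numbers of categories.
vif_table$adj_gvif <- vif_table$gvif^(1 / (2 * vif_table$df))

print(round(vif_table$adj_gvif, 2))

# Check against the thresholds quoted in the text (10 and 2.5).
print(any(vif_table$gvif > 10 | vif_table$adj_gvif > 2.5))
```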

E LASSO Coefficient Paths for Various Predictors

[Plot: LASSO Coefficient Paths. x-axis: Log(Lambda), from −8 to −1; y-axis: Coefficients,
from −4 to 2; top axis: number of non-zero coefficients (25, 25, 22, 19, 9, 3, 1, 0).]

Legend (path number: predictor)
1: GenderMale
2: Age18−27
3: Age28−32
4: Age38−42
5: Age43 and above
6: Arrival_Year2018 and before
7: Arrival_Year2019
8: Arrival_Year2020
9: Arrival_Year2021
10: Arrival_Year2022
11: County_SwedenOther county
12: County_Swedenskåne county
13: County_Swedenstockholm county
14: County_Swedenuppsala county
15: County_Swedenvästerbotten county
16: Education_LevelBachelor’s degree
17: Education_LevelHigher secondary and below
18: Education_LevelPhD and above
19: Educational_BackgroundBusiness, Management, Marketing
20: Educational_BackgroundElectrical Engineering
21: Educational_BackgroundMechanical Engineering
22: Educational_BackgroundMedical and Health Sciences
23: Educational_BackgroundOther Education background
24: Educational_BackgroundOther Engineering
25: Has_JobYes, part−time

Figure 1: LASSO Coefficient Paths for Various Predictors.

The plot shows how the coefficients of different predictors evolve as the regularization param-
eter, Lambda, changes.

F Survey Questions and Corresponding Bar Plots

Figure 2: 1. Gender
[Bar Plot of Gender. x-axis: Gender; y-axis: Count.]

Figure 3: 2. Age
[Bar Plot of Age. x-axis: Age; y-axis: Count.]

Figure 4: 3. Why did you move from Kerala to Sweden?
[Bar Plot of Move_Reason. x-axis: Move_Reason; y-axis: Count.]

Figure 5: 4. When did you arrive in Sweden?
[Bar Plot of Arrival_Year. x-axis: Arrival_Year; y-axis: Count.]
Figure 6: 5. Did you move to Sweden alone or with family?
[Bar Plot of Move_With_Family. x-axis: Move_With_Family; y-axis: Count.]

Figure 7: 6. What is your current visa status?
[Bar Plot of Visa_Status. x-axis: Visa_Status; y-axis: Count.]

Figure 8: 7. How did you apply for your visa the first time?
[Bar Plot of Visa_Application. x-axis: Visa_Application; y-axis: Count.]

Figure 9: 8. Did you have an interview for the first-time visa process?
[Bar Plot of Visa_Interview. x-axis: Visa_Interview; y-axis: Count.]
Figure 10: 9. What is your highest level of education?
[Bar Plot of Education_Level. x-axis: Education_Level; y-axis: Count.]

Figure 11: 10. What is your educational background?
[Bar Plot of Educational_Background. x-axis: Educational_Background; y-axis: Count.]
Figure 12: 11. Do you have a job?
[Bar Plot of Has_Job. x-axis: Has_Job; y-axis: Count.]

Figure 13: 12. Does your current job match your qualifications?
[Bar Plot of Job_Match. x-axis: Job_Match; y-axis: Count.]

Figure 14: 13. In your job (part-time or full-time), are you receiving a salary or amount as
specified by Swedish regulations?
[Bar Plot of Salary_Compliance. x-axis: Salary_Compliance; y-axis: Count.]

Figure 15: 14. How satisfied are you with the job you have now?
[Bar Plot of Job_Satisfaction. x-axis: Job_Satisfaction; y-axis: Count.]
Figure 16: 15. What do you think are the main problems in getting a job in Sweden?
[Bar Plot of Job_Problems. x-axis: Job_Problems; y-axis: Count.]

Figure 17: 16. Are you satisfied with medical care in Sweden?
[Bar Plot of Medical_Care_Satisfaction. x-axis: Medical_Care_Satisfaction; y-axis: Count.]
Figure 18: 17. Have you experienced any health issues after moving to Sweden?
[Bar Plot of Health_Issues. x-axis: Health_Issues; y-axis: Count.]

Figure 19: 18. How much do you care about your mental and physical health after moving to
Sweden?
[Bar Plot of Health_Care. x-axis: Health_Care; y-axis: Count.]

Figure 20: 19. How satisfied are you with Swedish food culture compared to Kerala food
culture?
[Bar Plot of Food_Culture_Satisfaction. x-axis: Food_Culture_Satisfaction; y-axis: Count.]

Figure 21: 20. How satisfied are you with the amount of personal or family time you have
after moving to Sweden?
[Bar Plot of Family_Time_Satisfaction. x-axis: Family_Time_Satisfaction; y-axis: Count.]
Figure 22: 21. How satisfied are you with your life after moving to Sweden?
[Bar Plot of Life_Satisfaction. x-axis: Life_Satisfaction; y-axis: Count.]

Figure 23: 22. Which district are you from in Kerala?
[Bar Plot of District_Kerala. x-axis: District_Kerala; y-axis: Count.]
Figure 24: 23. Which county are you currently residing in Sweden?
[Bar Plot of County_Sweden. x-axis: County_Sweden; y-axis: Count.]

Figure 25: 24. How would you rate cultural integration with Swedish society?
[Bar Plot of Cultural_Integration. x-axis: Cultural_Integration; y-axis: Count.]
Figure 26: 25. Have you faced any challenges after moving to Sweden? If yes, how often have
you faced these challenges within the first year you came?
[Bar Plot of Challenges_Faced. x-axis: Challenges_Faced; y-axis: Count.]

Figure 27: 26. What type of accommodation do you have?
[Bar Plot of Accommodation_Type. x-axis: Accommodation_Type; y-axis: Count.]

Figure 28: 27. What is your rent range per month?
[Bar Plot of Rent_Range. x-axis: Rent_Range (SEK); y-axis: Count.]

Figure 29: 28. Are you a parent? If yes, how satisfied are you with parental benefits in
Sweden?
[Bar Plot of Parental_Status. x-axis: Parental_Status; y-axis: Count.]
Figure 30: 29. How much knowledge did you have about Sweden before moving?
[Bar Plot of Sweden_Knowledge. x-axis: Sweden_Knowledge; y-axis: Count.]

G Missing Data Pattern and Imputation


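[Missing-data pattern plot for the eight model variables: Educational_Background,
Education_Level, County_Sweden, Arrival_Year, Job_Match, Has_Job, Gender, Age.
545 rows are complete; the incomplete patterns contain 4, 4, 2, 2, and 1 rows;
15 values are missing in total.]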

Figure 31: Missing Model Data Pattern

Table 15: Table with Highlighted Imputed Values in the Model Data

Education   Educational   Has Job   Job     Age     Arrival   County         Gender
Level       Background              Match           Year      Sweden

XXX         XXX           XXX       XXX     XXX     XXX       XXX            Female
XXX         XXX           XXX       XXX     XXX     XXX       XXX            Female
XXX         XXX           XXX       XXX     XXX     XXX       XXX            Female
XXX         XXX           XXX       XXX     XXX     XXX       XXX            Male

XXX         XXX           XXX       XXX     XXX     XXX       Uppsala        XXX
XXX         XXX           XXX       XXX     XXX     XXX       Vasterbotten   XXX
XXX         XXX           XXX       XXX     XXX     XXX       Skane          XXX
XXX         XXX           XXX       XXX     XXX     XXX       Skane          XXX

XXX         XXX           XXX       XXX     XXX     2021      XXX            XXX
XXX         XXX           XXX       XXX     XXX     2018      XXX            XXX

XXX         XXX           XXX       XXX     33-37   XXX       XXX            XXX
XXX         XXX           XXX       XXX     43-47   XXX       XXX            XXX

XXX         XXX           XXX       XXX     33-37   2022      XXX            Male

[Missing-data pattern plot across all survey variables.]

Figure 32: Missing Data Pattern
