Proposal revised down to data acquisition
@@ -2,8 +2,9 @@
\usepackage{latex8}

\usepackage{titlesec}
\usepackage[margin=0.5in]{geometry}
% \usepackage[margin=0.5in]{geometry}
\usepackage{graphicx}
\usepackage{amsmath}

\titleformat{\section}{\large\bfseries}{\thesection}{1em}{}

@@ -24,9 +25,9 @@ given the exponential number of dishes which can be created from a
small number of ingredients, as well as hard constraints such as
allergies and religious beliefs. Many professional catering services
handle this problem by allowing guests to select from a very limited
menu. We propose to develop a dish recommendation system
based on Bayesian Networks modeling user preferences and
which proposes meals that most likely match the varied tastes
menu. We introduce a dish recommendation system
based on Bayesian Networks modeling user preferences.
We predict the meals from a database of recipes that most likely match the varied tastes
of the customers, using a limited set of ingredients. This type of expert system
would be of great use to a catering service or restaurant which needs to rapidly decide on
a small number of dishes which would be acceptable for a large dinner party,
@@ -41,26 +42,44 @@ and past food choices \cite{janzenxiang}. Bayesian networks have also been
applied to recommendation systems before in on-line social
networks \cite{truyen} making predictions of the form
``if you bought those items what is the probability you would like to
buy that''. We suggest that these approaches are limited in that they only consider the preferences of a single (or supposed `typical') user rather than a group.
buy that''. We suggest that these approaches are limited in that they
only consider the preferences of a single (or supposed `typical') user rather than a group.

\section*{Proposed Approach}
\section*{Approach}

The problem we address is to pick a single meal which best meets the requirements
and tastes of different people dining together.
and tastes of different people dining together. We learn a predictive
Bayesian net from a survey distributed to participants of the meal as
training data in order to capture their preferences. The dishes
in the questionnaire are selected such that all ingredients
are covered. The participants rate each dish on a scale from
one to ten and give additional information, such as whether they are vegetarian.
For new dishes we then predict the maximum likelihood
rating given our model.
In the following we will describe our approach in detail.
First we will discuss the data selection, then the
modeling of the user preference and in the
end how to train the modeled net from
gathered data and how to predict the
value for a new recipe.
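
As a sketch of the prediction step, one reading of the maximum likelihood
rating is the most probable rating under the learned net. The rating
variable $R$ and the evidence $e$ (the ingredients of the new dish together
with the diner's survey answers) are names introduced here only for
illustration:
\begin{equation*}
\hat{r} = \operatorname*{arg\,max}_{r \in \{1,\dots,10\}} P(R = r \mid e).
\end{equation*}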

%\subsection*{Application Framework}
First, we will accumulate a diverse collection of sample recipes using the open source AnyMeal application
to convert freely available MealMaster format (flat file) recipes to XML format for input into the Java Bayesian network / optimization
application we propose.
\paragraph*{Data acquisition}
We accumulated a diverse collection of sample recipes using the open source AnyMeal application.
We converted the freely available MealMaster format (flat file)
recipes to XML format for input into our application.
We gathered data representing several diners' preferences for
approximately 20 meals using a simple survey of the type `rate on a
scale of 1 to 10, 10 being favorite and 1 being least favorite'.
Furthermore, we collected data for vegetarians and vegans.

%\subsection*{Data Collection}
Next, we will gather data representing several diners' preferences for approximately 20 meals using a simple survey of the type `rate on a scale of 1 to 10, 10 being favorite and 1 being least favorite'. A value of 0 for a given dish will be taken to mean that one or more ingredients trigger an allergy or violate a religious constraint, and the diner cannot consume the dish.

%\subsection*{Model}
%daniel is here
\paragraph*{Knowledge Engineering}
We will model each individual user's preferences and needs
as a Bayesian network, which means a set of independence and
conditional independence relationships between variables
\cite{russelnorvig}. Our model consists of 4 layers,
\cite{russelnorvig}.
Our model consists of 4 layers,
each modeling a different aspect of taste and needs.
In the first layer we capture general meal preferences, like
being vegetarian or not liking your food steamed.
@@ -79,26 +98,48 @@ someone suffers from diabetes.
The overall net is shown in Figure \ref{img:bayes_net}.
Given a recipe with a list of ingredients $I = i_1,\dots,i_n$
and a Bayesian network capturing user preferences
we can calculate the probability of users liking the dish as
$P(i_1 \wedge i_2 \wedge \dots \wedge i_n) = \prod_{j =
1}^{n} p(i_j \mid \mathit{parents}(i_j))$ \cite{russelnorvig}.
we can calculate the probability of users liking the dish given
the probabilities of liking each ingredient.

\begin{figure}
\centering
\includegraphics[width=\linewidth]{bayes.jpeg}
\includegraphics[width=\linewidth]{bayes}
\caption{Our Bayesian net modeling user preferences}
\label{img:bayes_net}
\end{figure}

%\subsection*{implementation}
\paragraph*{Learning and Predicting}
In order to estimate the model parameters, the
system will be trained with statistics about taste
and preferences given a set of dishes with ratings
from multiple users. From that information we can directly calculate
the probabilities for the ingredients.
the probabilities for the ingredients using maximum likelihood learning \cite{murphy}.


%\subsection*{Meal Optimization}
When learning the rest of the variables (that are not observed and therefore
hidden / latent) we will use Expectation Maximization \cite{russelnorvig}.
In order to model food preferences, we implemented
a Bayesian net library in Java. The library
uses the sum-product algorithm for
inference and maximum likelihood learning
for parameter estimation. In our implementation
we support discrete as well as continuous
probability distributions. Discrete distributions
can be modeled as tables or as trees.
In our implementation only continuous distributions with discrete parents
are supported. A continuous distribution is then modeled as a mapping
from every possible combination of its parents' values to a Gaussian.
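
To make this mapping concrete, one possible formalization is given here.
The symbols $\mu_y$, $\sigma_y^2$ and $D_y$ are introduced only for this
sketch and are not taken from the library. For a continuous variable $X$
whose discrete parents take the joint value $y$,
\begin{equation*}
p(X = x \mid Y = y) = \frac{1}{\sqrt{2\pi\sigma_y^2}}
\exp\left(-\frac{(x-\mu_y)^2}{2\sigma_y^2}\right).
\end{equation*}
Under maximum likelihood learning, the parameters would then be the sample
mean and variance of the values of $X$ in the data cases $D_y$ where the
parents take the value $y$:
\begin{equation*}
\mu_y = \frac{1}{|D_y|}\sum_{x \in D_y} x, \qquad
\sigma_y^2 = \frac{1}{|D_y|}\sum_{x \in D_y} (x-\mu_y)^2.
\end{equation*}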
Given a data set, the parameters of a discrete variable $X$ are
estimated as
\begin{align}
P(X = x \mid Y_1 = y_1, \dots, Y_k = y_k) = \nonumber\\
\frac{N(X = x, Y_1 = y_1, \dots, Y_k = y_k)}{N(Y_1 = y_1, \dots, Y_k = y_k)}
\end{align}
where $Y_1, \dots, Y_k$ are the parents of $X$ and $N(A)$ is the number of times event $A$ occurs in the data set.
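
As an illustration with made-up counts (not taken from our survey data),
suppose $X$ has a single parent $Y$ and the data set contains 8 cases with
$Y = y$, of which 6 also have $X = x$. The estimate is then
\begin{equation*}
P(X = x \mid Y = y) = \frac{N(X = x, Y = y)}{N(Y = y)} = \frac{6}{8} = 0.75.
\end{equation*}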
We decided to implement our own library
so that we understand what is going on and
can debug and fix the models
and algorithms easily.

\section*{Evaluation}
The application model will be trained using a sparse subset (25--50\%) of the survey data and the optimization problem solved for the inferred constraints.