\documentclass[times, 08pt,twocolumn]{article}
\usepackage{latex8}

\usepackage{titlesec}
% \usepackage[margin=0.5in]{geometry}
\usepackage{graphicx}
\usepackage{amsmath}

\titleformat{\section}{\large\bfseries}{\thesection}{1em}{}

\begin{document}
\pagestyle{empty}

\title{A Bayesian Approach to Collaborative Dish Selection}
\author{Team 10}
\date{February 23, 2012}

\maketitle
\section*{Introduction}
As anyone who has ever planned a catered event can attest, attempting
to satisfy the various palates, dietary requirements and tastes of a
group of diners can be a daunting task. This is particularly true
given the exponential number of dishes which can be created from a
small number of ingredients, as well as hard constraints such as
allergies and religious beliefs. Many professional catering services
handle this problem by allowing guests to select from a very limited
menu. We introduce a dish recommendation system
based on Bayesian networks that model user preferences.
From a database of recipes, we predict the meals that most likely match the varied tastes
of the customers, using a limited set of ingredients. This type of expert system
would be of great use to a catering service or restaurant which needs to rapidly decide on
a small number of dishes which would be acceptable for a large dinner party,
given diverse requirements and preferences.
\section*{Related Work}
Boekel and Corney propose using Bayesian networks to model
consumer needs in food production chains \cite{vanboekel,corney}.
Janzen and Xiang propose an intelligent refrigerator capable of
generating meal plans based on inventory
and past food choices \cite{janzenxiang}. Bayesian networks have also been
applied to recommendation systems in online social
networks \cite{truyen}, making predictions of the form
``if you bought those items, what is the probability you would like to
buy that''. We suggest that these approaches are limited in that they
only consider the preferences of a single (or supposed `typical') user rather than a group.
\section*{Approach}

The problem we address is to pick a single meal which best meets the requirements
and tastes of different people dining together. We learn a predictive
Bayesian net, using a survey distributed to the participants of the meal as
training data in order to capture their preferences. The dishes
in the questionnaire are selected such that all ingredients
are covered. The participants rate each dish on a scale from
one to ten and give additional information, such as whether they are vegetarian.
For new dishes we then predict the maximum likelihood
rating given our model.
In the following we describe our approach in detail.
First we discuss the data selection, then the
modeling of the user preferences, and finally
how we trained the net from the
gathered data and predicted the
ratings for different recipes.
\paragraph*{Data acquisition}
We first accumulated a diverse collection of license-free sample recipes from web sites such as \emph{Darkstar's Meal-Master Recipes} (\texttt{http://home.earthlink.net/\textasciitilde{}darkstar105/}). Next, we converted these recipes from flat text files to well-formed XML using the Krecipes application for Debian Linux. Finally, we created a representative data set capturing several diners' preferences for
24 of these recipes, using a simple survey of the type `rate on a
scale of 1 to 10, 10 being favorite and 1 being least favorite'. Furthermore, users were allowed to specify a vegetarian or nut-free meal preference.
%daniel is here
\paragraph*{Knowledge Engineering}
We model the diners' various taste preferences using
a Bayes net organized in two layers:

\begin{description}
\item[Layer 1] The first layer models a general preference towards
different food categories like vegetables or meat.
As Figure~\ref{img:bayes_net} shows, the food categories are dependent
on the general meal preference. For example,
being vegetarian will exclude meat and will
support vegetables.

\item[Layer 2] Specific flavors and ingredients. Each ingredient is conditioned
on the food category to which it belongs.
\end{description}
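
The two-layer structure described above corresponds to a factorization of the joint distribution. As a sketch only, assume that $M$ denotes the general meal preference, $C_1, \ldots, C_m$ the food categories, $I_1, \ldots, I_n$ the ingredients, and $c(j)$ the index of the category to which ingredient $j$ belongs. These symbols are introduced here for illustration and are not taken from the implementation. Ignoring any further nodes in the net, the layers would then factorize as
\begin{align*}
&P(M, C_1, \ldots, C_m, I_1, \ldots, I_n) \\
&\quad = P(M) \prod_{k=1}^{m} P(C_k \mid M) \prod_{j=1}^{n} P(I_j \mid C_{c(j)}).
\end{align*}
Each factor on the right-hand side is one conditional distribution whose parameters have to be learned from the survey data.
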
If we need to model hard constraints, like

The overall net is shown in Figure~\ref{img:bayes_net}.
Given a recipe with a list of ingredients $I = i_1, \ldots, i_n$
and a Bayesian network capturing user preferences,
we can calculate the probability of users liking the dish given
the probabilities of liking each ingredient.
\begin{figure}
\centering
\includegraphics[width=\linewidth]{bayes}
\caption{Our Bayesian net modeling user preferences}
\label{img:bayes_net}
\end{figure}
%\subsection*{implementation}
\paragraph*{Learning and Predicting}
In order to estimate the model parameters, the
system is trained with statistics about taste
and preferences, given a set of dishes with ratings
from multiple users. From that information we can directly calculate
the probabilities for the ingredients using maximum likelihood learning \cite{murphy}.
%\subsection*{Meal Optimization}

In order to model food preferences, we implemented
a custom Bayesian net library in Java with minimal use of third-party libraries (e.g.\ for XML input).
We chose to implement our own library for maximum flexibility and to ensure that the learning algorithm functions precisely as follows.

The library
uses the sum-product algorithm for
inference and maximum likelihood learning
for parameter estimation. Our implementation
supports discrete as well as continuous
probability distributions. Discrete distributions
can be modeled as tables or as trees.
Continuous distributions are supported only when all of their parents are discrete.
A continuous distribution is then modeled as a mapping
from each possible combination of its parents' values to a Gaussian.
Given a data set, the parameters of a discrete variable $X$ with parents $Y_1, \ldots, Y_k$ are
estimated as
\begin{align}
&P(X = x \mid Y_1 = y_1, \ldots, Y_k = y_k) \nonumber\\
&\quad = \frac{N(X = x, Y_1 = y_1, \ldots, Y_k = y_k)}{N(Y_1 = y_1, \ldots, Y_k = y_k)}
\end{align}
where $N(A)$ is the number of times event $A$ occurs in the data set.
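
As a worked illustration of this count-based estimate, consider hypothetical counts that are not taken from our survey. Suppose a binary variable $X$ (likes beef) has a single parent $Y$ (vegetarian), and that among $N(Y = \text{no}) = 8$ training ratings from non-vegetarians, $N(X = \text{yes}, Y = \text{no}) = 6$ indicate a liking for beef. The estimate is then
\begin{align*}
P(X = \text{yes} \mid Y = \text{no}) = \frac{6}{8} = 0.75 .
\end{align*}
For a continuous variable with discrete parents, the standard maximum likelihood estimates of a Gaussian are the sample mean and the (biased) sample variance of the values $x_1, \ldots, x_m$ observed together with a given parent combination. We state them as the textbook result, not as a description of the library's code:
\begin{align*}
\hat{\mu} = \frac{1}{m} \sum_{t=1}^{m} x_t, \qquad
\hat{\sigma}^2 = \frac{1}{m} \sum_{t=1}^{m} (x_t - \hat{\mu})^2 .
\end{align*}
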
\section*{Evaluation}
The model is trained using a sparse subset (50\%) of the survey data, and the optimization problem is solved for the inferred constraints. As shown below, the calculated preferences for recipes which were not used to train the Bayes net are quite close to the actual survey data, which essentially reflects the following preferences (sample ratings are on a 1 to 10 scale):

\begin{description}
\item[Diner 1] No allergies, prefers all dishes equally (5)

\item[Diner 2] Vegetarian, meat dishes are (1), remainder are (9)

\item[Diner 3] Nut allergy, prefers meat (6) to vegetarian (4) to dessert (3)

\item[Diner 4] No allergies, prefers pork and desserts (9), remainder are (3)
\end{description}
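
To make the prediction target concrete, the group rating of a dish can be taken as the mean of the four diners' ratings. As an illustration, assuming dishes that contain no nuts, a meat-free dessert, and only the stated preferences listed above, the group ratings would be
\begin{align*}
\text{pork dish:}\quad & \tfrac{1}{4}(5 + 1 + 6 + 9) = 5.25,\\
\text{beef dish:}\quad & \tfrac{1}{4}(5 + 1 + 6 + 3) = 3.75,\\
\text{dessert:}\quad & \tfrac{1}{4}(5 + 9 + 3 + 9) = 6.5.
\end{align*}
These values are derived from the preference summary above, not read from the individual survey responses.
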
Next, we calculate the correlation between the application's ranking of all dishes and the actual ranking as determined by the user surveys. We suggest that a high degree of correlation indicates that the system has the potential to accurately appraise constrained group food preferences for dishes which are not part of the survey, given sufficiently detailed recipe information. As Figure~\ref{rms-table} shows, the estimated food preferences are quite close to the actual mean ratings over all diners for the dishes which were not used to train the Bayes net. The root-mean-square error for calculated vs.\ surveyed meal preferences is approximately 1.0.

\begin{figure}[h!]
\centering
\includegraphics[width=0.5\textwidth]{BayesChefChart.png}
\caption{Estimated vs.\ Actual Survey Dish Ratings}
\label{rms-table}
\end{figure}
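
The two summary statistics used above can be written explicitly. As a sketch, let $\hat{r}_d$ denote the estimated rating and $\bar{r}_d$ the mean surveyed rating of held-out dish $d$, for $d = 1, \ldots, D$ (symbols introduced here for illustration). The root-mean-square error is
\begin{align*}
\mathrm{RMSE} = \sqrt{\frac{1}{D} \sum_{d=1}^{D} (\hat{r}_d - \bar{r}_d)^2},
\end{align*}
so a value of about 1.0 corresponds to a deviation of roughly one point on the ten-point scale. One common choice of correlation measure is the Pearson coefficient; this is an assumption, and a rank correlation would be an equally plausible choice:
\begin{align*}
\rho = \frac{\sum_{d} (\hat{r}_d - \hat{m})(\bar{r}_d - \bar{m})}
{\sqrt{\sum_{d} (\hat{r}_d - \hat{m})^2}\,\sqrt{\sum_{d} (\bar{r}_d - \bar{m})^2}},
\end{align*}
where $\hat{m}$ and $\bar{m}$ are the means of the estimated and surveyed ratings over the held-out dishes.
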
Note the outlier at Dish \#2 (Bayou Shrimp Creole). The strong preference for this dish is a result of the ingredient list containing primarily shrimp and tomato. Unlike beef and pork, the seafood category was not implemented in the knowledge engineering of the net. Consequently, this dish is incorrectly deemed to be vegetarian-compatible. The same issue had previously occurred at Dish \#5 (Baked Steak and Lima Beans) until `steak' was added to the recipe parser as a synonym for beef, and therefore a type of meat.
\bibliographystyle{plain}
\bibliography{p2refs}

\end{document}