Work in progress - rough draft.

Woody Folsom
2012-03-12 20:33:42 -04:00
parent 03201217a1
commit 53fddc965a
7 changed files with 236 additions and 176 deletions


@@ -59,47 +59,41 @@ rating given our model.
In the following we will describe our approach in detail.
First we will discuss the data selection, then the
modeling of the user preference and in the
end how to train the modeled net from
gathered data and how to predict the
value for a new recipe.
end how we trained the modeled net from
gathered data and predicted the
value for different recipes.
\paragraph*{Data acquisition}
We accumulated a diverse collection of sample recipes using the open source AnyMeal application.
We converted the freely available MealMaster format (flat file)
recipes to XML format for input into our application.
We gathered data representing several diners' preferences for
approximately 20 meals using a simple survey of the type 'rate on a
scale of 1 to 10, 10 being favorite and 1 being least favorite'.
Furthermore we collected data for vegetarians and allergies.
We first accumulated a diverse collection of license-free sample recipes from web sites such as \emph{Darkstar's Meal-Master Recipes} (http://home.earthlink.net/~darkstar105/). Next, we converted these recipes from flat text files to well-formed XML using the Krecipes application for Debian Linux. Finally, we created a data set representing several diners' preferences for
24 of these recipes, using a simple survey of the type 'rate on a
scale of 1 to 10, 10 being favorite and 1 being least favorite'. Furthermore, users were allowed to specify a vegetarian or nut-free meal preference.
%daniel is here
\paragraph*{Knowledge Engineering}
We model the diners' various taste preferences using
a Bayes net. In the first layer in Figure \ref{img:bayes_net}
we capture control variables such as being vegetarian or allergic to nuts.
These are modeled as boolean variables, with 'true' indicating the presence of a constraint.
We model users' preferences using
a Bayes net. In the first layer in Figure \ref{img:bayes_net}
we capture control variables like vegan or allergies.
Those are modeled as boolean variables.
If the user is allergic or a vegetarian, it will set the variable allergic
to 0.
In order to model
Our model consists of 4 layers,
each modeling a different aspect of taste and needs.
In the first layer we capture general meal preferences, like
being vegetarian or not liking your food steamed.
The second layer models a general preference towards
different food categories like vegetables or beef.
each modeling a different aspect of taste or dietary requirement.
\begin{description}
\item[Layer 1] General meal preferences such as
being vegetarian or being allergic to nuts.
\item[Layer 2] The second layer models a general preference towards
different food categories like vegetables or meat.
As one can see, the food categories are dependent
on the general meal preference. For example
being vegetarian will exclude beef and will
support vegetables. The third category models
different ingredients. Each ingredient is conditioned
by the food category it belongs to.
In the last layer we have hard constraints like allergies
(that will exclude a particular ingredient) or
the overall calorie content of the meal given
someone suffers from diabetes.
being vegetarian will exclude meat and will
support vegetables.
\item[Layer 3] Specific flavors and ingredients. Each ingredient is conditioned
by the food category to which it belongs.
\end{description}
The overall net is shown in Figure \ref{img:bayes_net}.
Given a recipe with a list of ingredients $I = \{i_1, \ldots, i_n\}$
and a Bayesian network capturing user preferences
@@ -124,7 +118,10 @@ the probabilities for the ingredients using Maximum Likelihood Learning \cite{mu
%\subsection*{Meal Optimization}
In order to model food preferences, we implemented
a Bayesian net library in Java. The library
a custom Bayesian net library in Java with minimal use of third-party libraries (e.g., for XML input).
We chose to implement our own library for maximum flexibility and to ensure that the learning algorithm functions precisely as follows:
The library
uses the sum-product algorithm for
inference and maximum likelihood learning
for parameter estimation. In our implementation
@@ -141,14 +138,73 @@ P(X = x \mid Y_1 = y_1, \ldots, Y_n = y_n) =\\
\frac{N(X = x \mid Y_1 = y_1, \ldots, Y_n = y_n)}{N(Y_1 = y_1, \ldots, Y_n = y_n)}
\end{align}
where $N(A)$ is the number of times event $A$ occurs in the data set.
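As a purely illustrative example with made-up counts (they are not taken from our survey data, and the node names are hypothetical), suppose a food category node $Y$ takes the value \emph{meat} in $8$ training samples, and an ingredient node $X$ takes the value \emph{liked} in $6$ of those $8$ samples. The estimate is then
\[
P(X = \text{liked} \mid Y = \text{meat})
= \frac{N(X = \text{liked} \mid Y = \text{meat})}{N(Y = \text{meat})}
= \frac{6}{8} = 0.75 .
\]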
We decided to implement our own library,
so we understand what is going on and
we can debug and fix the models
and algorithms easily.
\section*{Evaluation}
The application model will be trained using a sparse subset (25--50\%) of the survey data and the optimization problem solved for the inferred constraints.
Next, we will calculate the correlation between the application's ranking of all dishes and the actual ranking as determined by the user surveys. We suggest that a high degree of correlation indicates that the system has the potential to accurately appraise constrained group food preferences for dishes which are not part of the survey, given sufficiently detailed recipe information.
The application model was trained using a sparse subset (50\%) of the survey data, and the optimization problem was solved for the inferred constraints. As shown below, the calculated preferences for recipes which were not used to train the Bayes net are quite close to the actual survey data, which essentially reflects the following preferences (sample ratings are on a 1--10 scale):
\begin{description}
\item[Diner 1] No allergies, prefers all dishes equally (5)
\item[Diner 2] Vegetarian, meat dishes are (1), remainder are (9)
\item[Diner 3] Nut allergy, prefers meat (6) to vegetarian (4) to dessert (3)
\item[Diner 4] No allergies, prefers Pork and Desserts (9), remainder are (3)
\end{description}
Next, we calculate the correlation between the application's ranking of all dishes and the actual ranking as determined by the user surveys. We suggest that a high degree of correlation indicates that the system has the potential to accurately appraise constrained group food preferences for dishes which are not part of the survey, given sufficiently detailed recipe information. As Table~\ref{rms-table} shows, the estimated food preferences are quite close to the actual mean ratings over all diners for the dishes which were not used to train the Bayes net. The root-mean-square error for calculated vs. surveyed meal preferences is approximately 1.0.
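For reference, and assuming the standard definition is the one intended, the root-mean-square error over the $m$ held-out dishes can be written as
\[
\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{j=1}^{m} \left( \hat{r}_j - r_j \right)^2 },
\]
where $\hat{r}_j$ denotes the estimated rating and $r_j$ the surveyed mean rating of dish $j$. The symbols $m$, $\hat{r}_j$ and $r_j$ are introduced here only for illustration.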
%% actual data from non-constrained (Allergy etc) trial run
% TASTE for Southwest Smoothie [DAIRY] : 5 0.14272807979501637
% TASTE for Bayou Shrimp Creole [TOMATO] : 5 0.015864055945522846
% TASTE for Crab Burgers [EGGS] : 5 0.14272807979501637
% TASTE for Broiled Flounder [GENERIC_NUTS, EGGS] : 5 0.14272807979501637
% TASTE for Baked Steak And Lima Beans [TOMATO, SUGAR] : 5 0.015864055945522846
% TASTE for Eggplant Lasagna [GLUTEN] : 5 0.14272807979501637
% TASTE for Salisbury Steak with Mushroom Sauce [GLUTEN, DAIRY, BEEF] : 4 0.020337720115187766
% TASTE for Meatless Loaf [SPICE] : 5 0.14272807979501637
% TASTE for Lemon Pork Chops [PORK, SUGAR] : 5 0.006727267528344914
% TASTE for Fava Bean Burgers [EGGS, POTATO] : 3 0.005331713336331368
% TASTE for Angel Hair Pesto Primavera [GENERIC_NUTS, SPICE] : 5 0.14272807979501637
% TASTE for Kahlua Cake [] : 5 0.14272807979501637
%%
%\begin{center}
\begin{tabular}{ | l | l | l | }
\hline
Southwest Smoothie: & &\\
DAIRY & 1.2 & 5.5 \\ \hline
Bayou Shrimp Creole: & &\\
TOMATO & & 3.75 \\ \hline
Crab Burgers: & &\\
EGGS & & 3.75 \\ \hline
Broiled Flounder: & &\\
GENERIC NUTS, EGGS & & 3.75 \\ \hline
Baked Steak And Lima Beans: & &\\
TOMATO, SUGAR & & 3.75 \\ \hline
Eggplant Lasagna: & &\\
GLUTEN & & 5.25 \\ \hline
Salisbury Steak with Mushroom Sauce: & &\\
GLUTEN, DAIRY, BEEF & & 3.75 \\ \hline
Meatless Loaf: & &\\
SPICE & & 5.25 \\ \hline
Lemon Pork Chops: & &\\
PORK, SUGAR & & 5.25 \\ \hline
Fava Bean Burgers: & &\\
EGGS, POTATO & & 5.25 \\ \hline
Angel Hair Pesto Primavera: & &\\
GENERIC NUTS, SPICE & & 5.25 \\ \hline
Kahlua Cake & & 4.5 \\ \hline
\end{tabular}
\label{rms-table}
%\end{center}
\begin{figure}[h!]
\centering
\includegraphics[width=0.5 \textwidth]{BayesChefChart.png}
\caption{Estimated vs. surveyed dish ratings}
\end{figure}
\bibliographystyle{plain}
\bibliography{p2refs}