Work in progress - rough draft.

2012-03-12 20:33:42 -04:00
parent 03201217a1
commit 53fddc965a
7 changed files with 236 additions and 176 deletions
--- a/Proposal.tex
+++ b/Proposal.tex
@@ -59,47 +59,41 @@ rating given our model.
 In the following we will describe our approach in detail.
 First we will discuss the data selection, then the
 modeling of the user preference and in the
-end how to train the modeled net from
-gathered data and howe to predict the 
-value for a new recepice.
+end how we trained the modeled net from
+gathered data and predicted the 
+value for different recipes.

 \paragraph*{Data acquisition}
-We accumulated a diverse collection of sample recipes using the open source AnyMeal application.
-We converted to the freely available MealMaster format (flat file)
-recipes to XML format for input into our application.
-We will gathered data representing several diners' preference for
-approximately 20 meals using a simple survey of the type 'rate on a
-scale of 1 to 10,  10 being favorite and 1 being least favorite'.  
-Furthermore we collected data for vegetarians and allergies.
+We first accumulated a diverse collection of license-free sample recipes from web sites such as \emph{Darkstar's Meal-Master Recipes} (http://home.earthlink.net/\textasciitilde{}darkstar105/).  Next, we converted these recipes from flat text files to well-formed XML using the Krecipes application for Debian Linux.  Finally, we created a data set representing several diners' preferences for
+24 of these recipes, using a simple survey of the type `rate on a
+scale of 1 to 10,  10 being favorite and 1 being least favorite'.  Furthermore, users were allowed to specify a vegetarian or nut-free meal preference.

 %daniel is here
 \paragraph*{Knowledge Engineering}

+We model the diners' various taste preferences using
+a Bayes net. In the first layer in Figure \ref{img:bayes_net}
+we capture control variables such as being vegetarian or allergic to nuts.
+These are modeled as boolean variables, with `true' indicating the presence of a constraint.

-We model users' preferences using
-a bayes net. In the first layer in Figure \ref{img:bayes_net}
-we capture controll variables like vegan or allergies.
-Those are modeled as boolean variables. 
-If the user is allergic or a vegetarian, it will set the variable allergic
-to 0. 
-
-In order to model 
 Our model consists of 4 layers, 
-each modeling a different aspect of taste and needs.
-In the first layer we capture general meal preferences, like
-being vegetarian or not liking your food steamed.
-The second layer models a general preference towards 
-different food categories like vegetables or beef.
+each modeling a different aspect of taste or dietary requirement.
+\begin{description}
+\item[Layer 1] General meal preferences such as
+being vegetarian or being allergic to nuts.
+
+\item[Layer 2] General preferences towards
+different food categories such as vegetables or meat.
 As one can see, the food categories are dependent 
 on the general meal preference. For example 
-being vegetarian will exclude beef and will 
-support vegetables. The third category models 
-different ingredients. Each ingredient is conditioned 
-by the food category it belongs to.
-In the last layer we have hard constraints like allergies
-(that will exclude a particular ingredient) or 
-the overall calorie content of the meal given
-someone suffers from diabetes.
+being vegetarian will exclude meat and will 
+support vegetables.
+
+\item[Layer 3]  Specific flavors and ingredients. Each ingredient is conditioned 
+by the food category to which it belongs.
+
+\end{description}
+
 The overall net is shown in Figure \ref{img:bayes_net}.
 Given a recipe with a list of ingredients $I = i_1,...,i_n$ 
 and a Bayesian network capturing user preferences 
@@ -124,7 +118,10 @@ the probabilities for the ingredients using Maximum Likelihood Learning \cite{mu

 %\subsection*{Meal Optimization}
 In order to model food preferences, we implemented
-a baysian net library in java. The library
+a custom Bayesian net library in Java with minimal use of third-party libraries (e.g. for XML input).
+We chose to implement our own library for maximum flexibility and to ensure that the learning algorithm functions precisely as follows:
+
+The library
 uses the sum-product algorithm for
 inference and maximum likelihood learning
 for parameter estimation. In our implementation
@@ -141,14 +138,73 @@ P(X = x| Y_1 = y_1, ... Y_2 = y2) =\\
 N(X = x| Y_1 = y_1, ... Y_2 = y2) \over N(Y_1 = y_1, ... Y_2 = y2)
 \end{align}
 where $N(A)$ is the number of times event $A$ occurs in the data set.
-We decided to implement our own Library,
-so we understand what is going on and
-we can debug and fix the models
-and algorithms easily.
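+
+As a worked illustration with hypothetical counts (not taken from our survey data), suppose the training set contains $N(Y = y) = 8$ samples in which a single parent $Y$ takes the value $y$, and that $X = x$ also holds in $6$ of them. Writing $N(X = x, Y = y)$ for the number of samples in which both hold, the estimate is
+\begin{align}
+P(X = x| Y = y) = {N(X = x, Y = y) \over N(Y = y)} = {6 \over 8} = 0.75 .
+\end{align}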

 \section*{Evaluation}
-The application model will be trained using a sparse subset (25-50\%) of the survey data and the optimization problem soled for the inferred constraints.
-Next, we will calculate the correlation between the application's ranking of all dishes and the actual ranking as determined by the user surveys.  We suggest that a high degree of correlation indicates that the system has the potential to accurately appraise constrained group food preferences for dishes which are not part of the survey, given sufficiently detailed recipe information.
+The application model was trained using a sparse subset (50\%) of the survey data, and the optimization problem was solved for the inferred constraints.  As shown below, the calculated preferences for recipes which were not used to train the Bayes net are quite close to the actual survey data, which essentially reflects the following preferences (sample ratings are on a 1--10 scale):
+
+\begin{description}
+\item[Diner 1] No allergies, prefers all dishes equally (5)
+
+\item[Diner 2] Vegetarian, meat dishes are (1), remainder are (9)
+
+\item[Diner 3] Nut allergy, prefers meat (6) to vegetarian (4) to dessert (3)
+
+\item[Diner 4] No allergies, prefers pork and desserts (9), remainder are (3)
+
+\end{description}
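+
+To make the group rating concrete, assume that the rating of a dish is the unweighted mean over the four diners (an assumption made here for illustration). A meat dish containing neither pork nor nuts is then rated $5$, $1$, $6$ and $3$ by Diners 1 to 4, which gives
+\begin{align}
+{5 + 1 + 6 + 3 \over 4} = 3.75 .
+\end{align}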
+
+Next, we calculate the correlation between the application's ranking of all dishes and the actual ranking as determined by the user surveys.  We suggest that a high degree of correlation indicates that the system has the potential to accurately appraise constrained group food preferences for dishes which are not part of the survey, given sufficiently detailed recipe information.  As Table~\ref{rms-table} shows, the estimated food preferences are quite close to the actual mean ratings over all diners for the dishes which were not used to train the Bayes net.  The root-mean-square error for calculated vs. surveyed meal preferences is approximately 1.0.
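+
+For reference, we assume the usual definition of the root-mean-square error over the $m$ dishes that were not used for training, where $\hat{r}_j$ denotes the estimated and $r_j$ the surveyed mean rating of dish $j$ (these symbols are introduced here for illustration only):
+\begin{align}
+\mathrm{RMSE} = \sqrt{{1 \over m} \sum_{j=1}^{m} \left(\hat{r}_j - r_j\right)^2} .
+\end{align}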
+
+%% actual data from non-constrained (Allergy etc) trial run
+% TASTE for Southwest Smoothie [DAIRY] : 5 0.14272807979501637
+% TASTE for Bayou Shrimp Creole [TOMATO] : 5 0.015864055945522846
+% TASTE for Crab Burgers [EGGS] : 5 0.14272807979501637
+% TASTE for Broiled Flounder [GENERIC_NUTS, EGGS] : 5 0.14272807979501637
+% TASTE for Baked Steak And Lima Beans [TOMATO, SUGAR] : 5 0.015864055945522846
+% TASTE for Eggplant Lasagna [GLUTEN] : 5 0.14272807979501637
+% TASTE for Salisbury Steak with Mushroom Sauce [GLUTEN, DAIRY, BEEF] : 4 0.020337720115187766
+% TASTE for Meatless Loaf [SPICE] : 5 0.14272807979501637
+% TASTE for Lemon Pork Chops [PORK, SUGAR] : 5 0.006727267528344914
+% TASTE for Fava Bean Burgers [EGGS, POTATO] : 3 0.005331713336331368
+% TASTE for Angel Hair Pesto Primavera [GENERIC_NUTS, SPICE] : 5 0.14272807979501637
+% TASTE for Kahlua Cake [] : 5 0.14272807979501637
+%%
+
+\begin{table}[h!]
+\centering
+ \begin{tabular}{ | l | l | l | }
+    \hline
+Southwest Smoothie: & &\\
+DAIRY & 1.2 & 5.5 \\ \hline
+Bayou Shrimp Creole:  & &\\
+TOMATO & & 3.75 \\ \hline
+Crab Burgers:  & &\\
+EGGS & & 3.75 \\ \hline
+Broiled Flounder:  & &\\
+GENERIC NUTS, EGGS & & 3.75 \\ \hline
+Baked Steak And Lima Beans:  & &\\
+TOMATO, SUGAR & & 3.75 \\ \hline
+Eggplant Lasagna:  & &\\
+GLUTEN & & 5.25 \\ \hline
+Salisbury Steak with Mushroom Sauce:  & &\\
+GLUTEN, DAIRY, BEEF & & 3.75 \\ \hline
+Meatless Loaf:  & &\\
+SPICE & & 5.25 \\ \hline
+Lemon Pork Chops:  & &\\
+PORK, SUGAR & & 5.25 \\ \hline
+Fava Bean Burgers:  & &\\
+EGGS, POTATO & & 5.25 \\ \hline
+Angel Hair Pesto Primavera:  & &\\
+GENERIC NUTS, SPICE & & 5.25 \\ \hline
+Kahlua Cake & & 4.5 \\ \hline
+\end{tabular}
+\caption{Estimated and surveyed ratings for the dishes not used to train the Bayes net}
+\label{rms-table}
+\end{table}
+
+\begin{figure}[h!]
+\centering
+\includegraphics[width=0.5 \textwidth]{BayesChefChart.png}
+\caption{Estimated vs. surveyed dish ratings}
+\end{figure}

 \bibliographystyle{plain}
 \bibliography{p2refs}