Here are Clark's thoughts: Perhaps we humans, and a great many other organisms, too, are deploying a fundamental, thrifty, prediction-based strategy that husbands neural resources and (as a direct result) delivers

Methods of Measuring Error

Adjusted $R^2$

The $R^2$ measure is by far the most widely used and reported measure of error and goodness of fit. Notice that the prediction errors have a mean of -5.1 (for a line with a slope of 1).

The first part ($-2 \ln(\text{Likelihood})$) can be thought of as the training set error rate and the second part ($2p$) can be thought of as the penalty to adjust for the

The opposite of understanding is confusion, which is not knowing which model can reasonably be appealed to. So, at least on its face, one would expect to find diversity in the basic (nontrivial) principles governing these modules (with some modules doing PEM and others using some other information-processing

Measuring Error

When building prediction models, the primary goal should be to make a model that most accurately predicts the desired target value for new data.

This is a fundamental property of statistical models. The two mechanisms both serve to reduce prediction error, but the means by which they do so differ. Yes, you can accommodate cases like the dating one by appealing to the "right" level of the temporal hierarchy, but it starts to sound like you can accommodate any data by

Mean Squared Prediction Error

The AIC formulation is very elegant.
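As a rough illustration of the two-part structure described above ($-2 \ln(\text{Likelihood})$ plus the $2p$ penalty), here is a minimal sketch in Python. It assumes Gaussian errors so that the maximized log-likelihood can be computed from residuals alone; the function names `gaussian_log_likelihood` and `aic` are hypothetical and not taken from the original text.

```python
import math


def gaussian_log_likelihood(residuals):
    """Maximized log-likelihood under an assumed Gaussian error model.

    Uses the maximum-likelihood variance estimate RSS / n.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    sigma2 = rss / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def aic(log_likelihood, p):
    """AIC = -2 ln(Likelihood) + 2p, where p is the number of parameters."""
    return -2.0 * log_likelihood + 2.0 * p


# Illustrative residuals only; these are not data from the text.
residuals = [1.0, -1.0, 1.0, -1.0]
log_lik = gaussian_log_likelihood(residuals)
aic_value = aic(log_lik, p=2)
```

When two candidate models are fitted to the same data, the one with the lower AIC is usually preferred, since the $2p$ term offsets the gain in fit that extra parameters bring.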

In this case, your error estimate is essentially unbiased but it could potentially have high variance. How could PEM deal with scenarios where people actively seek novelty or surprising situations?

A good model also captures the given input in a minimally complex fashion, without too many unnecessary parameters.

And I can't think of any other theoretical framework that comes even close to this. Overfitting may give decent momentary or short-term prediction error minimization, but it is bound to fail in the long run. At these high levels of complexity, the additional complexity we are adding helps us fit our training data, but it causes the model to do a worse job of predicting new

In general, the notion of modularity is of course still debated. In addition to those priors learned empirically, some constraints (enabling and otherwise) on priors will result from phylogenetic and environmental factors.

While the proposals in the perception literature seem straightforward enough that they could be implemented neurophysiologically, I'm worried that your more ambitious proposal takes us away from anything realistically implementable. Does this mean I don't expect it to occur and not be acted on?

For instance, in the illustrative example here, we removed 30% of our data.

Holdout data split.
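The 30% holdout mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the model is a simple least-squares line, and the helper names (`holdout_split`, `fit_line`, `mean_squared_error`) are hypothetical rather than anything from the original example.

```python
import random


def holdout_split(data, test_fraction=0.3, seed=0):
    """Shuffle a copy of the data and set aside a fraction as the holdout set."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(points, slope, intercept):
    """Average squared difference between observed and predicted y."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points) / len(points)


# Synthetic data (an assumption for illustration): y = 2x + 1 plus noise.
noise = random.Random(1)
data = [(x, 2.0 * x + 1.0 + noise.gauss(0, 1)) for x in range(100)]

train, test = holdout_split(data, test_fraction=0.3)
slope, intercept = fit_line(train)            # fit on the 70% only
holdout_mse = mean_squared_error(test, slope, intercept)  # score on the unseen 30%
```

Because the holdout points play no part in fitting, `holdout_mse` estimates error on new data, at the cost of training on less data.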

What the dark room problem tells us is that prediction error minimization always happens given a model, a set of expectations. Ultimately, it appears that, in practice, 5 or 10 folds are generally effective choices for cross-validation. I conceive of understanding as having a reasonable model for making sense of a domain, even if there is still uncertainty about the states of the domain.
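For the cross-validation remark above, a k-fold procedure might look roughly like the sketch below. It assumes a simple least-squares line as the model and uses hypothetical helper names (`k_fold_indices`, `fit_line`, `cross_validated_mse`); the data are made up for illustration.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_mse(points, k=5, seed=0):
    """Average held-out mean squared error over k folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(points), k, seed):
        held = set(held_out)
        train = [p for i, p in enumerate(points) if i not in held]
        test = [points[i] for i in held_out]
        slope, intercept = fit_line(train)
        squared = [(y - (slope * x + intercept)) ** 2 for x, y in test]
        fold_errors.append(sum(squared) / len(test))
    return sum(fold_errors) / k


# Made-up noiseless data lying exactly on y = 3x + 2.
points = [(float(x), 3.0 * x + 2.0) for x in range(20)]
folds = k_fold_indices(len(points), k=5)
cv_mse = cross_validated_mse(points, k=5)
```

Each point is held out exactly once, so every observation contributes to both fitting and evaluation, unlike a single holdout split.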

However, in addition to AIC there are a number of other information-theoretic equations that can be used. But don't we already have such a principle in representation? Happiness is just the absence of prediction error, after all… However, I do think that this is what makes PEM worth the effort.

The simplest of these techniques is the holdout set method. The model solutions list an estimation variance that reflects imperfections in the model arising for a number of reasons, one of which is the choice of model itself. They are honed in empirical Bayes, that is, they are learned from prior experience.

A prediction error minimisation system (scheme) does not aim for perfect mirroring; to do so would lead to an unfit system, as you point out. Hunger, thirst, loneliness are all states we don't expect in the long run, so they are surprising. Of course the true model (what was actually used to generate the data) is unknown, but given certain assumptions we can still obtain an estimate of the difference between it and

Even at a more cognitive level, which might be closer to the level of description you are aiming for, there will be differences. You're right about the challenge to folk psychology! There is, however, a very direct way to link action and adaptive fitness (set out in the papers Bryan links to), but going that route involves accepting the free energy principle

I make a comparison with Lamarckism… (The following is taken from http://headbirths.wordpress.com/talks/intelligence-and-the-brain/ ) "I would suggest that Free Energy[/(PEM)] is currently where evolution was in around the year 1800 – around the