11/3/16

Cortical trade-offs in generalist vs. specialist bias


Neocortex is a network of neurons, loosely organized in a hierarchy of generalization. In this hierarchy, stimuli selectively propagate from primary to association areas and cortices ("Cortex & Mind", Cortical Memory by Joaquin Fuster). Higher areas select for stimuli of greater spatial and temporal scope. In such generalization, lower-level stimuli are sequentially replaced by their collective “representations”.

Most of neocortex is connections between neurons (dendrites and axons), plus their life support. Given limited resources within a skull, there must be a tradeoff between total number of connections and their average length. In other words, cortex can be relatively dense, with more numerous connections of shorter average length, or sparse, with fewer total connections of greater average length.

Generalization is just another word for pattern discovery, and a basic pattern is a coincidence of multiple inputs. For a neuron, the inputs are presynaptic spikes. A sufficient number of coincident spikes triggers Hebbian learning: “fire together, wire together” between simultaneously spiking neurons. More precisely, a synapse is strengthened if the pre-synaptic neuron fires just before the post-synaptic one.
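A minimal sketch of the timing rule described above, assuming the common exponential model of spike-timing-dependent plasticity. The function name, the amplitudes a_plus and a_minus, and the time constant tau_ms are illustrative assumptions, not values from any source cited here. The weakening branch for the reverse spike order belongs to that standard model and is not stated in this text.

```python
import math


def stdp_weight_change(dt_ms, a_plus=0.01, a_minus=0.012, tau_ms=20.0):
    """Change in synaptic weight for one pre/post spike pair.

    dt_ms is t_post - t_pre in milliseconds.
    All parameter values are hypothetical, chosen only for illustration.
    """
    if dt_ms > 0:
        # Pre-synaptic spike came just before the post-synaptic one:
        # strengthen, more so when the two spikes are closer in time.
        return a_plus * math.exp(-dt_ms / tau_ms)
    if dt_ms < 0:
        # Reverse order: weaken (an assumption of the standard model).
        return -a_minus * math.exp(dt_ms / tau_ms)
    return 0.0


if __name__ == "__main__":
    for dt in (2.0, 20.0, 80.0, -2.0):
        print(dt, stdp_weight_change(dt))
```

Closer timing gives a larger change, which is one way to read "closer timing of coincident input spikes" as a stronger pattern.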

The choice of coincident inputs becomes exponentially greater with the length of connections. Hence, stronger patterns (greater number and closer timing of coincident input spikes) can be discovered. But that greater range must come at the cost of having fewer total connections, representing lesser detail. That, in turn, requires greater selectivity in learning: longer reinforcement to form and strengthen synapses.
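The density vs. range trade-off above can be put in rough numbers. This is only a sketch under loud assumptions: a fixed total wiring budget, a uniform neuron density, and a candidate pool equal to the neurons inside a sphere whose radius is the mean connection length. The function name and all values are hypothetical. Under this purely geometric assumption the pool grows with the cube of the length, and the number of possible input combinations grows faster still.

```python
import math


def wiring_tradeoff(total_wiring_um, mean_length_um, neuron_density_per_um3):
    """Return (number of connections, candidate partners within reach).

    Assumes a fixed total wiring length and a uniform neuron density;
    both are simplifications made only for illustration.
    """
    # Fixed budget: longer connections mean fewer of them.
    n_connections = total_wiring_um / mean_length_um
    # Candidate partners: neurons inside a sphere of radius mean_length_um.
    reach_volume = (4.0 / 3.0) * math.pi * mean_length_um ** 3
    candidates = neuron_density_per_um3 * reach_volume
    return n_connections, candidates


if __name__ == "__main__":
    dense = wiring_tradeoff(1e6, 100.0, 1e-4)
    sparse = wiring_tradeoff(1e6, 200.0, 1e-4)
    print("dense: ", dense)
    print("sparse:", sparse)
```

With these made-up numbers, doubling the mean length halves the number of connections while multiplying the pool of reachable partners by eight.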

So, other things being equal, there must be a tradeoff between speed and detail of learning, and relative scope and stability of learned patterns. The former is prioritized in a dense hierarchy (my “specialist bias”), and the latter in a sparse hierarchy (my “generalist bias”).


Cellular factors in density vs. range trade-off.


Initial determinant of cortical “density” is the rate of division and survival of neuronal progenitor cells during embryonic and perinatal cortical development. Slower division or faster die-off would leave fewer progenitor cells, which in turn produce fewer cortical neurons. That should leave more space and resources to grow correspondingly longer axons and dendrites between them. This is likely determined by nerve growth factors and their receptors: higher activity would produce a denser network. One such factor may be coded by the CNTNAP2 gene, expressed mostly in prefrontal and parietal cortices, and probably correlated with autism: Genes for autism or genes for connectivity.

The most recognizable feature of neocortex is its six layers. Deeper and older layers VI and V mostly mediate cortico-subcortical integration, layer IV propagates data flow up the cortical hierarchy via thalamus, and newer layers II and III provide intra-cortical connectivity, mostly via layer I axons.
Henry Markram recently reported innate “peak connectivity” of layer V pyramidal cells at 300-500 μm. There are ~50 cell clusters (representation units) interlaced within that distance. It seems to me that these clusters provide minimal representation redundancy and mutual support via reverberating firing. Each cluster probably responds to some specific intensity of stimulus. These clusters inhibit each other within a column: “Sparse distributed coding model…” to adjust for redundancy within the receptive field.
So, variation in the range of such peak connectivity may be one of the main factors in dense vs. sparse bias.

A unique feature in human brain (and to a lesser extent in other primates and whales) is spindle cells. Wikipedia: "Spindle cells emerge postnatally and eventually become widely connected with diverse parts of the brain, evidencing their essential contributions to the superior capacity of hominids to focus on difficult problems." Axons of spindle cells are less branched than those of pyramidal neurons, and their extended range must come at the expense of reduced density of other connections. This trade-off probably enables better top-down (general-to-specific) focus in humans.

Another possible factor in the trade-off is the ratio of glia to neurons, which I think is also a good sign of a "sparse" architecture. This is an excerpt from "The Root of Thought": "As we move up the evolutionary ladder, in a widely researched worm, Caenorhabditis elegans, glia are 16 percent of the nervous system. The fruit fly’s brain has about 20 percent glia. In rodents such as mice and rats, glia make up 60 percent of the nervous system. The nervous system of the chimpanzee has 80 percent glia, with the human at 90 percent. The ratio of glia to neurons increases with our definition of intelligence."


However, his interpretation that glia are a main information processing component in the human brain is implausible. I agree with the mainstream opinion that they mainly provide support for neurons.
Greater proportion of glia would reduce density of neurons, but enable higher activity and longer-range connections for each of them. Again, that means a sparser network.

I might be wrong on much of the specifics (above and below), but that wouldn't affect the "sparse vs. dense" premise itself. This premise depends only on the assumption that neocortex is a hierarchy of generalization for the patterns of stimuli, and I can't think of any alternative to that. The trade-off may differ among cortical areas, but I think specificity of such differences is relatively low. That’s because the number of progenitor cells in the cortical sub-plate is set before any significant differentiation between the areas. Also, genetic variation among individuals is very minor, and whatever differences develop through postnatal learning are already affected by innate biases.


Differences between higher and lower cortical regions


Joaquin Fuster traced the differences between primary & hierarchically higher association areas in Cortex & Mind, p. 73: “At the lower level, representation is highly concrete and localized, and thus highly vulnerable. Local damage leads to well-delimited sensory deficit. In unimodal association cortex, representation is more categorical and more distributed, in networks that span relatively large sections of that cortex… In transmodal areas representation is even more widely distributed…” P. 82: “Thus a higher-level cognit (e.g., an abstract concept) would be represented in a wide network of association cortex…” In my terms, wider networks imply “sparse bias” on higher levels of generalization.

Similar quotation from “How to Create a Mind” by Ray Kurzweil, p. 86: A study of 25 visual and multi-modal cortical areas by Daniel Felleman found that “As they went up the neocortical hierarchy… processing of patterns comprised larger spatial areas and involved longer time periods”.
Another study by Uri Hasson stated: “It is well established that neurons along the visual cortical pathways have increasingly larger spatial receptive field.” and found that “similar to cortical hierarchy of spatial receptive fields, there is a hierarchy of progressively longer temporal receptive windows”. 

It’s also suggestive that parietal cortex seems to have higher-frequency brain waves than the prefrontal one. Buschman & Miller of MIT "have found two types of attention in two separate regions of the brain. The prefrontal cortex is in charge of willful concentration; if you are studying for a test or writing a novel, the impetus and the orders come from there. But if there is a sudden, riveting event—the attack of a tiger or the scream of a child—it is the parietal cortex that is activated. The MIT scientists have learned that the two brain regions sustain concentration when the neurons emit pulses of electricity at specific rates—faster frequencies for the automatic processing of the parietal cortex, slower frequencies for the deliberate, intentional work of the prefrontal." I think higher-frequency waves are associated with reaction speed and detail, and lower frequency with a longer feedback loop of higher levels.

The neocortex is myelinated sequentially from primary to association areas at correspondingly increasing age (up to ~30 years old for prefrontal cortex), & myelination then seems to decline in reversed order ("Human Neurophysiology", page 197). Allowing for a multi-year delay in knowledge accumulation, this probably reflects &/or determines the age at which abilities peak in fields that require knowledge of corresponding generality. It's known that athletic abilities (primary cortices) peak in early 20s, and mathematical skills (likely parietal cortex) a bit later.

On the other hand, performance in business, politics, social sciences, and literature (anterior prefrontal cortex?) doesn't peak until late in life. This is probably even more true in philosophy, but performance metrics there are questionable. Also supportive is the observation that this cortical development sequence is delayed by several years in subjects with ADHD. Obviously, effective generality of discovered concepts, thus also development of higher association areas, depends on attention span.

The differences among cortical regions also include cortical hemispheric asymmetry. It appears that left hemisphere represents higher-generality concepts, especially semantic ones. The right hemisphere works mostly in the background, likely searching for lower-level contextual patterns (Cortex & Mind, p. 184, Split Brain, Gazzaniga). So, according to my premise, left hemisphere should be relatively “sparse”. This is supported in “Cortex & Mind”, p. 185: “Pyramidal cells in language areas have been found to be larger on the left than on the right (Hayes & Lewis, 1995; Hutsler & Gazzaniga, 1997)”. Their dendritic trees also extend further than those of right-hemisphere pyramids (Jacobs & Scheibel, 1993).

Another study: "Hemispheric asymmetries in cerebral cortical networks" found that columns in left hemisphere contain fewer minicolumns and better myelinated axons than corresponding areas of right hemisphere. Total volume and the number of synapses seem to be the same for both hemispheres. The hemispheres are densely interconnected by the corpus callosum. Some of this connectivity provides for simple sensory-motor field integration and fault-tolerance. But greater “lateralization” in humans, vs. other primates, suggests that our hemispheres are integrated into a combined hierarchy, resulting in ultimately greater power of generalization. A Finnish study found that ambidexterity (correlated with lesser hemispheric asymmetry) doubles the risk of ADHD and lower academic performance in children.


Autism as a dense-connectivity cognitive style.

The best evidence for individual differences in cognitive focus comes from research on autism, which is known to increase attention to details, often at the expense of higher generalization. So, it’s a good proxy for a "specialist phenotype", which according to my thesis should display greater short-range vs. long-range connectivity. Below, I summarize some evidence for such bias in autism.

Casanova in "Abnormalities Of Cortical Circuitry In The Brains Of Autistic Individuals" reports that autistic individuals have more numerous but smaller and more densely packed minicolumns, each containing smaller than normal neurons with shorter axons (I came across it via A Shade of Gray: excellent review of relevant research, highly recommend).
Related study "Comparison of the Minicolumnar Morphometry of Three Distinguished Neuroscientists and Controls" by Casanova is reported in "Minicolumns, Genius, and Autism". The connectivity pattern of the neuroscientists appears to be similar to autistics in the density and size of minicolumns, but different in better inhibitory isolation between adjacent minicolumns. This should compensate for smaller size, while enabling greater number of minicolumns.

Similar finding of more compact and better insulated minicolumns in primates and cetacea, compared to cats and lower mammals, was reported in A comparative perspective on minicolumns and inhibitory GABAergic interneurons in the neocortex.
The thesis of local vs. global connectivity bias in autism is also supported in Exploring the Folds of the Brain--And Their Links to Autism by Hilgetag and Barbas: "in autistic people, communication between nearby cortical areas increases, whereas communication between distant areas decreases".
Such cortex should be more reliant on cortico-thalamo-cortical vs. cortico-cortical connections, which might be the implication in Partially enhanced thalamocortical functional connectivity in autism.

Henry Markram, a leading neuroscientist, a father of an autistic son, and “pretty much an autist myself”, proposed The intense world theory - a unifying theory of the neurobiology of autism:

“The proposed neuropathology is hyper-functioning of local neural microcircuits, best characterized by hyper-reactivity and hyper-plasticity. Such hyper-functional microcircuits are speculated to become autonomous and memory trapped leading to the core cognitive consequences of hyper-perception, hyper-attention, hyper-memory and hyper-emotionality. The theory is centered on the neocortex and the amygdala, but could potentially be applied to all brain regions. The severity on each axis depends on the severity of the molecular syndrome expressed in different brain regions, which could uniquely shape the repertoire of symptoms of an autistic child. The progression of the disorder is proposed to be driven by overly strong reactions to experiences that drive the brain to a hyper-preference and overly selective state, which becomes more extreme with each new experience and may be particularly accelerated by emotionally charged experiences and trauma. This may lead to obsessively detailed information processing of fragments of the world and an involuntarily and systematic decoupling of the autist from what becomes a painfully intense world. The autistic is proposed to become trapped in a limited, but highly secure internal world with minimal extremes and surprises.”

"In the early phase of the child's life, repetition is a response to extreme fear. The autist perceives, feels and fears too much. Let them have their routines, no computers, television, no sharp colors, no surprises. It's the opposite of what parents are told to do. We actually think if you could develop a filtered environment in the early phase of life you could end up with an incredible genius child without many of the sensory challenges."

"The main critical periods for the brain during which time circuits form irreversibly are in the first few years (till about the age of 5 or so). We think this is an important age period when autism can either fully express to become a severe handicap or turned to become a major advantage. We think a calm filtered environment will not send the circuits into hyper-active modes, but the brain will keep most of its potential for plasticity. At later ages, filtered environments should help calm the autistic child and give them a starting point from where they can venture out. Each autistic child probably will first need its own bubble environment before one can start mixing bubbles. It should happen mostly on its own, but with very gentle guidance and encouragement. Do all you would want for your child ... but in slow motion ... let the child set the pace ... they need that control to feel secure enough to begin to venture off into any other bubbles."

A recent study found a reason for such intense perception: reduced synaptic spine pruning in autistic brains, secondary to mTOR over-expression. This seems to happen at the age of 3-4 years, when synaptic spine density normally decreases by ~50%. I suspect reduced pruning may also happen during prenatal development. If so, greater synaptic density should increase activity, thereby reducing normal prenatal die-off of neurons. This would explain minicolumnar differences noted by Casanova.

Either way, more numerous synaptic spines increase density of connections in the cortex, which must ultimately come at the expense of their effective range.
Truly pathological autism probably requires more than increased synaptic density and activity. The ultimate cause might be something as basic as pre- or postnatal viral infection or retroviral expression, combined with low vitamin D levels (as is likely the case for schizophrenia and bipolar disorder).


Schizophrenia as a sparse-connectivity disorder.


On the opposite end of cognitive spectrum, excessively sparse connectivity in the cortex seems to increase the risk of schizophrenia.

One such risk factor is a recently discovered greater ratio of astrocytes to neurons in schizophrenia, specifically in prefrontal cortex (astrocytes are a type of glial cell, covered above). This imbalance, demonstrated in a RIKEN study on stem cells and post-mortem brains of patients vs. controls, seems to be due to reduced expression of gene DGCR8.
Fewer neurons means less robust networks, more vulnerable to acute damage, but more astrocytes can better maintain remaining neurons for regular wear and tear of our long life.

A Duke University study found a more direct “sparse” risk factor for schizophrenia: increased synaptic pruning. "Spine pruning theory is supported by the observation that the frontal brain regions of people with schizophrenia have fewer dendritic spines, the tentacles on the receiving ends of neurons that process signals from other cells". But this increased pruning happens during puberty, probably secondary to increased testosterone, vs. reduced pruning at 3-4 years old in autism.

Since schizophrenia is a uniquely human disorder, there must have been a reason for these risk factors to evolve. Other things that are unique for humans are large neocortex, complex society, and long life.
I think these risks and benefits are closely related: decreased density leaves more space and resources (such as astrocytes) for remaining neurons and connections, so they may grow longer. That enables global intellectual integrity, thus dynamic social coordination and long-term planning in general.

More specific “sparse disorder” may be dyslexia. This connection was also made by Manuel Casanova: “Autism and dyslexia: A spectrum of cognitive styles as defined by minicolumnar Morphometry“, although there is a lot less research on that. Basically, he thinks that dyslexia is caused or exacerbated by a very “lossy” cognitive style, at least in some sensory association cortices.


Implications and speculations


Generalist vs. specialist trade-offs are somewhat ambiguous in terms of modern societal utility:
- On one hand, speed & precision were more important for survival in the wild, which may explain why apes seem to have photographic memory, superior to humans: Chimps beat humans in memory test.
- On the other hand, more recent functional differentiation of modern society once again requires increasingly “lossless” knowledge acquisition. Social positions that do require higher generalization are relatively few, including law, management, sales & marketing, politics and related academic disciplines.

In terms of gender, men are obviously overrepresented among extreme specialists, and even more so among generalists. This should be expected: extremes, especially those in environmental detachment, are quite risky, and risk is a male domain. Males don’t contribute nearly as much to reproduction as females, in some species nothing but their genes. But they have an additional or alternative purpose in evolution: to serve as a test vehicle for variations, initially genetic and lately also memetic.

Relatively speaking, women don’t take chances. They have two X chromosomes to conceal mutations, more symmetrical brains as a backup for damage, stronger immune system and higher HDL. This also applies to behavioral differences: lower testosterone and vasopressin to avoid risk, higher estrogen and oxytocin to seek and provide support, generally heightened senses to pay more attention to their bodies and immediate environment. Another salient difference is recently discovered higher myelination in female thalamus, likely related to faster and more frequent attention switching in women. All that must come at the expense of intellectual detachment and higher generalization.

Paradoxically, generalist bias may also be associated with smaller brain size, due to shorter global links. For example, it is known that low-generality savant abilities can be induced by inhibiting prefrontal cortex, presumably because top-down focus selectively inhibits bottom-up perception. Inversely, shorter distances improve signal propagation across global networks (such as fronto-parietal, fronto-temporal, and salience networks), suppressing bottom-up detail by top-down filtering. Increased selectivity is also necessary to compensate for reduced overall memory capacity of a smaller brain.

Another benefit of smaller size is potentially better quality of development. Given the same time-to-maturity, there is a well-known “slow growth vs. sloppy growth” trade-off in biology. Basically, slower growth allows for more time and effort to prevent and correct mistakes made during cellular division and other anabolic processes. For example, slower-growing axons are less affected by fluctuations in gradients guiding their growth cones. So, they will be straighter, further reducing the length of connections among cortical areas. And shorter connections are associated with higher IQ.

Of course, this is contrary to the conventional bigger-is-better view, supported by increasing brain size in human evolution. But this trend reversed after the Neolithic revolution, which might not be a bad thing. Some margin of that increased brain and body size is net-beneficial only for fight or flight emergencies. Given drastically improved security of settled society, that margin should become net-detrimental, by impairing cellular-level quality. For example, although animals of larger species generally live longer, smaller individuals of the same species live longer than the larger ones in the absence of predation.

This is also true for women. But women have proportionally less white matter and more grey matter than men, which compensates for shorter distances. And I think subcortical differences are even more important: lower testosterone and higher oxytocin make women more sensitive to their immediate environment, especially social. They’re better at bottom-up perception, but are less free in top-down selection. Women do seem to have better integrity, but within a narrower range of interests.


On another note, IQ tests are inherently incapable of capturing higher generalization ability because they are time-limited. The tests are supposed to be background-neutral, except for verbal and math IQ. Thus, they can only measure our ability to discover patterns within data given to a subject during a relatively brief test. That means they’re biased toward the speed of learning, and sparse & slow subjects will be at a disadvantage. This is effectively confirmed by the finding that lobotomy, a procedure that disables prefrontal cortex (the seat of the highest generalization levels), has little or no impact on IQ.

The same bias is built into any educational system: the detail-oriented "dense" subjects are better at passive knowledge acquisition. "Sparse" architecture is an advantage for independent research and critical thinking, but those are far more difficult to evaluate. Also, modern science has already accumulated a very substantial body of knowledge, which must be passively acquired before one can make a novel contribution. That's a major problem for a generalist. Einstein’s observation that “imagination is more important than knowledge” may no longer hold in established fields (not mine).

There's been a lot of talk about association between "genius" and autism, which I think is misleading for two reasons. First, the diagnosis of autism includes asocial behavior, which is irrelevant: anyone with unusual interests will be correspondingly "asocial". It also includes avoidance of novelty, which is emotionally overwhelming for an autist. But a detached generalist would also avoid novelty, for the opposite reason: it is likely to be a trivial distraction relative to his own thoughts.

Second, it is far easier to recognize exceptional abilities of a specialist than those of a generalist. We all share lower generality levels: that's where the data comes from. But effective generality of top association cortices definitely differs among individuals, and it takes an equally competent generalist to evaluate quality of generalizations. I think that’s partly why quality of work in psycho-social sciences, and especially in philosophy, is so vastly inferior to that in more precise (lossless) "hard" sciences. So, an autistic genius is far more likely to gain societal recognition than an “anti-autistic” one.

Needless to say, my research of this subject is motivated by introspection.

Cultivating top-down focus


Sustained top-down attention is a must for any theoretical work. Such ability is scarce: we evolved to focus on here-and-now survival, relegating far and future to the back of the mind. Modern society is drastically more secure, but our attention spans haven’t changed. A lot of people could become world-changing geniuses, if they spent 10 years of their youth fully focused on an important problem.
“I have no special talent, I am only passionately curious.” Albert Einstein. But such focus must come at the cost of “life”: unthinkable for ADHD-addled hunter-gatherers that we still are.

Attention span as discussed here is not a simple duration of focus on a given subject. Rather, it’s a top-down vs. bottom-up selection of such subjects. The top is long-term priorities, derived from broad generalizations, and the bottom is current experience. I first decided on my top priority in adolescence. But maintaining effective working focus on abstractions, vs. “real” distractions, was far more difficult. “Thinking is to people as swimming is to cats: they can do it but prefer not to.” Daniel Kahneman.
Over the years, I greatly improved my concentration via the following techniques:


Practice, externalization, and formalization:


Anything profound is initially boring, only the superficial instantly appeals to the animal within us. Thus, interests must deepen with incremental practice, which cultivates curiosity about the subject. Practice also forms redundant representations, differentiated by their context to explore alternative scenarios. Such redundancy helps to maintain parallel subconsciously searching threads, even when your consciousness is distracted. It also fills up memory & starves unrelated subjects of resources. This is very important: irrelevant memories keep competing for our attention until they fade out.

Obsessed with externalities, we need a conducive environment to facilitate a virtuous cycle of practicing. Our basic working environment is a notepad or a computer screen, so we need to fill them with a well-designed write-up of the subject. Quite obviously, the brain has plenty of memory for a few pages of text; the scarce resource here is our attention. Writing down thoughts turns them into sensory feedback, which is far more effective at attracting conscious attention than “internal” abstractions. Motor feedback also helps: verbalizing, writing by hand, semi-random editing or re-arranging of code or text.

Formalization starts with developing subject-specific language: concise and unambiguous terminology, abbreviations, symbols. Such language is critical for building an explicit and comprehensive model of a subject, small and structured enough to reverberate within one’s working memory. A write-up of such a model must be incrementally refined and extended: nothing worthwhile can be done on the first try. Theoretical work is all about maximizing integrity (compression) of a model. Far too many researchers simply accumulate vaguely related POVs, without resolving contradictions and overlaps among them.


Stimulation and avoiding distractions:



We are social creatures and our most important “environment and stimulants” is people we deal with.
Hence the urge to bounce our ideas and decisions off others: it makes us focus on their implications. Your listener's (credible) attention stimulates yours, even if he doesn't contribute anything. To facilitate relevant stimulation, universities and companies impose face-to-face contact among colleagues. But relevance of institutions themselves depends on societal consumer competence, which is sorely lacking on higher-generality subjects. And social stimulation can be replaced by writing or talking to oneself.

Absent relevant stimulation (be honest about “relevant”), one must block the irrelevant one. Real-life socializing is almost always meaningless, compared to impersonal reading and writing. People are desperate to join a group and rejection feels like a death sentence. But if there is no sufficiently relevant group, any socializing is a huge waste of mind space. However miserable social isolation is at first, you will get used to it. For a broadly stimulated brain with a clear purpose, attention is a zero-sum game.

Such broad stimulation is easy: tea, cocoa, and low-dose nicotine (patch) do it for me. As distinct from smoking, nicotine itself is pretty benign, see Gwern. For less intrinsically stimulated, there are ritalin, adderall, deprenyl, modafinil, etc. Another potent stimulant is exercise while working. I work on a treadmill desk and alternate between walking, standing, and sitting, all while remaining in front of projector screen (which is more “immersive” and distant than a monitor: it doesn’t jump in the eyes as much when you walk). Definitely recommend, it probably added ~2 hours of work per day.

Beside socializing, the worst attention hog now is the web, and my solution is rationing. Unless there is something urgent or work-related (unlikely), I only connect for ~2 hours a day. Sticking to it was a challenge, I have to use “Freedom” to keep myself honest. Sounds trivial, but it made a huge difference to my concentration. And don’t even get me started on the current cellphone plague: I never wanted one.


Direct self-conditioning:


But even more insidious, at least for a generalist like me, are internal distractions: wandering thoughts. There is a low-tech solution: thought conditioning. Negative conditioning is simple and old-fashioned: just slap your face when you catch yourself thinking about the irrelevant. Or, more subtly, repeat a mantra: “this is not real” or “it doesn’t matter”. Eventually, irrelevant subjects will acquire unpleasant associations and you stop thinking about them.
The very habit of constantly monitoring your thoughts for distractions (mindfulness) helps to terminate them.

Positive conditioning of relevant thoughts is far more difficult: they are fluid and don’t associate with specific cues (the target of conventional reinforcement). Less specific but still helpful is behavioral conditioning: reserving a specific desk, computer, and times of the day only for work. Such cognitive behavioral therapy is useful for all self-control problems. You may even want to lock yourself in for a pre-determined time: just put the key in the "kitchen safe":).


Another form of indirect subject conditioning is neurofeedback, article. I currently use, with moderate success, a very simple version: writing down the number of hours spent on work every day, translating total number of hours spent into most effective 1/3 out of recent working hours.
More advanced neurofeedback may become possible in relatively near future by visualizing subject-associated cortical activity via transcranial imaging, such as EEG or infrared spectroscopy.

Eventually, we will directly stimulate cortical areas that represent relevant subjects via transcranial stimulation or implants. Stimulation by red and infrared light is already feasible, though not very precise. Overall, top-down attention should be improved by stimulating left dorsolateral prefrontal cortex, which represents the highest levels of task-specific generalization. But specific symbolic and mathematical problems seem to be processed in left inferior parietal cortex, especially angular gyrus.


Deliberate control over the subject of attention will be the most profound revolution yet: it will change what we want out of life. But waiting for the technology will leave you hopelessly behind those who do it the old-fashioned way. Of course, most of us dressed-up apes don’t care, - there are bananas to be picked.

Introspection



I am what I do: strictly functional design of cognitive process: www.cognitivealgorithm.info. Good concentration is relatively recent, but I’ve been working on a theory of intelligence most of my life; anything else is trivial by comparison. I work on my own because nothing I’ve come across is coherent enough. Current Machine Learning is an experimental field, with little interest in theoretical work.

And also because I can, recently financially and always emotionally. Not that I am the smartest guy in the world, but it’s not what you got, it’s how you use it. Lacking nearly universal dependence on social support and superfluous experimentation, I am free to pursue intellectual imperatives.
The older I get (chronologically 54, biologically late 30s), the more it hits me just how abnormal I am: an extreme generalist, with attention span = forever and ridiculous amount of emotional detachment. That is probably my innate cortical bias.

My first interests were geography and history (basic big-picture fields), then physical sciences and biology. I majored in social science because society has the deepest structured complexity. But that field always lacked in academic integrity. And the most important part of social progress is discovery and invention, which is basically a composite of individual human learning. So I switched to studying the latter, about a lifetime ago, both for intellectual depth and for potential impact on the world.

That doesn’t mean psychology and neuroscience. I got into both more recently, mostly for insight into our deficiencies. To understand intrinsic function of cognition, vs. tons of other things in human mind, I think sustained introspective generalization is far superior to observation. Having started with the former, I find almost everything about brain and neurons to be grossly sub-optimal. Which is not surprising for a product of blind evolution and severe biological constraints.

Formalizing cognition is also the only legitimate problem in philosophy, which was my interest for a while. But philosophers are too busy bullshitting college freshmen and other clueless highbrows, they don’t seem to have much time or motivation for real work.

You may have noticed that my interests are missing math and computer science. These are primarily deductive fields: solutions looking for a problem, and causing a man-with-hammer syndrome. I start with induction to define intelligence, then derive operations from that definition. The latter is not terribly controversial, but consistent derivation makes almost all math that I came across irrelevant.

My impression is that people who like math do so for its clarity and certainty, at least initially. But there is a direct tradeoff between certainty and complexity of the subject, and subjects don’t get more complex than human intelligence. I picked complexity and speculation first, certainty had to wait.

Cognitive algorithm must be designed with incremental complexity. Complex math is useful on higher levels of generalization, but is too expensive initially. And even a relatively low-complexity core algorithm should be able to learn increasingly complex computational short-cuts (AKA math) on its own, just like we do. Things like calculus and such are certainly not innate in humans.

Also missing here is anything biographic. Mine is a life of mind, the rest is a distraction.
Throughout history, working alone on my problem would have had no real consequence. Things changed: publish on the net now and Google will find you with the right keywords, status and credentials be damned. And convincing people is not even necessary anymore, all you really need is working code.

Still, a constructive conversation would be nice for now, seeing that I am short of working code.
Anything I write is extremely speculative, to be original. But that’s not an end in itself, understanding my subject is. I never stop questioning assumptions and all my posts are a work in progress.

7/7/12

Consciousness as an artifact of brain-to-body bottleneck



Conscious attention, implemented as working memory, is focused on a single or few items at a time. Intrinsically, cognition doesn't need such central focus: our brains are massively parallel. Ideally, the pool of neurons should be allocated to many subjects of interest by something like a market, according to predictive value of these subjects. As it probably happens in unconscious or intuitive cognition.

But the brain evolved to guide a single body, which in most respects can only do one thing at a time. Hence the artifact of central consciousness. Beside disrupting smooth allocation of cognitive resources, this bottleneck obviously favors somatic concerns, which constitute lower forms of human motivation.

Such sequential focus is likely implemented by thalamus, which serves as a central switchboard for the brain. It seems to invoke consciousness by generating higher-frequency brainwaves, especially gamma waves, which bind together areas related to working memory (brief overview: The Missing Moment by Robert Pollack, pp. 46-56, or a far more involved treatment: Rhythms of the Brain by Gyorgy Buzsaki).

My personal opinion is that the main function of thalamus is to mediate competition between brain areas, particularly via TRN. From a networking perspective, it’s a lot cheaper to do this in a central body, as opposed to each region or column directly inhibiting all others. In fact, Sherman and Guillery suggest that the thalamus could be viewed as a consolidated “7th layer” of neocortex (“Exploring the Thalamus”).
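As a rough illustration of the networking argument (a sketch of mine, not taken from the cited sources), compare the number of links needed when every region inhibits every other one directly with the number needed when all regions compete through one central mediator. The function names, and the assumption of exactly one link to and one link from the mediator per region, are hypothetical simplifications.

```python
def pairwise_links(n_regions: int) -> int:
    """Directed links if every region directly inhibits every other region."""
    return n_regions * (n_regions - 1)


def hub_links(n_regions: int) -> int:
    """Links if each region connects only to a central mediator (one up, one down)."""
    return 2 * n_regions


# Direct mutual inhibition grows quadratically, mediation through a hub grows linearly.
for n in (10, 100, 1000):
    print(n, pairwise_links(n), hub_links(n))
```

Under these assumptions the central mediator is cheaper for any network of more than three regions, and the gap widens quickly with size.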

Primary sensory and motor cortices seem to be overrepresented in thalamus, - pulvinar nuclei alone comprise 40% of it. Better thalamic connectivity of primary cortices should enhance search for relevant associations in other brain areas. This is introspectively plausible: most of working memory is what we currently visualize, vocalize, or actualize. I think we enhance our focus on general concepts in the same fashion: by generating fake experiences of subvocalizing, subvisualizing, and subactualizing.

This is probably mediated by feedback to primary cortices, underutilized during sensory “vacations”.
So, primary cortices are frequently “hijacked” by higher areas to simulate (interactively project) their generalized concepts. Such “primarization” is particularly important for mathematicians, engineers, and scientists, who work on imaginary constructs and often think visually rather than verbally.

However, primary cortices are unnaturally “low” for such subjects. Because “elevation” is wrong, these projections often become false memories, confabulations, hallucinations, - substitution of imagination (feedback) for actual experience (feedforward). This may be a factor in developing schizophrenia, in which imagination seems to get out of control. It is suggestive that default mode network, and specifically left posterior cingulate cortex, were found to be unusually active in schizophrenics.

Such confusion should be more likely in habitually hijacked primary areas, which may become less attached to their respective senses. More general concepts are represented by higher association cortices. The highest area seems to be dorsolateral PFC: developmentally the last to myelinate and the most involved in executive function. Primarization could be mediated by short-cuts to lower levels of cortical hierarchy, such as arcuate fasciculus and spindle neurons, with their far-reaching axons.



Basal ganglia: subcortical modulator of attention.



While thalamus seems to be a relatively neutral mediator of competition for conscious attention, basal ganglia implements conditioning, which actively directs focus. Phasic dopamine in basal ganglia also indicates “reward prediction error”, and variation in sensitivity to dopamine is a risk factor for ADHD.

For example, ADHD is correlated with 7-repeat allele of DRD4, which accelerates reuptake. Even more important might be variation in COMT gene: Met 158 allele, which degrades postsynaptic dopamine 4x slower than Val 158 allele, is associated with better working memory, but slower attention switching. Basically, it enhances top-down or goal-directed attention vs. bottom-up or novelty-oriented attention. ADHD is treated by norepinephrine and dopamine agonists or reuptake inhibitors, such as Bupropion.

This differs between hemispheres: "To advance our understanding of ADHD and medication effects we draw upon the evidence for (1) a neurotransmitter imbalance between norepinephrine and dopamine in attention-deficit hyperactivity disorder and (2) an asymmetric neural control system that links the dopaminergic pathways to left hemispheric processing and links the noradrenergic pathways to right hemispheric processing. It appears that attention-deficit hyperactivity disorder may involve a bi-hemispheric dysfunction characterized by reduced dopaminergic and excessive noradrenergic functioning. In turn, favorable medication effects may be mediated by restoration in neurotransmitter balance and by increased control over the allocation of attentional resources between hemispheres".

On a cellular level, temporal attention span is inversely proportional to the "decay rate" for stimuli propagating from primary into association areas of neocortex. Passive decay is caused by charge dissipation across neuronal membrane and reuptake of excitatory neurotransmitters at the synapses. Such decay promotes relatively novel stimuli. On the other hand, active suppression by neurons that represent competing stimuli, via inhibitory interneurons and neurotransmitters, should promote relatively recurrent or concurrent stimuli. Longer term, slower passive decay would correspond to longer connections and competition among more distant and persistent stimuli.
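A minimal sketch of that inverse relation, assuming (my assumption, not a measured model) that a stimulus trace decays exponentially and holds attention until it falls below a fixed threshold. The names attention_span, decay_rate, initial, and threshold are hypothetical.

```python
import math


def attention_span(decay_rate: float, initial: float = 1.0, threshold: float = 0.1) -> float:
    """Time for a trace initial * exp(-decay_rate * t) to fall to the threshold."""
    if decay_rate <= 0 or not 0 < threshold < initial:
        raise ValueError("need decay_rate > 0 and 0 < threshold < initial")
    return math.log(initial / threshold) / decay_rate


# Assumed rates in arbitrary units: halving the decay rate doubles the span.
fast_decay = attention_span(decay_rate=1.0)
slow_decay = attention_span(decay_rate=0.5)
print(slow_decay / fast_decay)
```

With any other monotonic decay curve the exact numbers change, but the span still shrinks as the decay rate grows.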

Other factors affecting stimuli decay rate are axonal straightness and myelination, structural trade-offs within cortical minicolumns and thalamus (see “Cortical Trade-Offs” post), and so on. A developmental possibility is that high levels of cortisol / low levels of serotonin increase the levels of phasic dopamine, which in turn accelerates dopamine reuptake. ADHD sufferers have fewer dopamine autoreceptors, leading to greater fluctuations in its levels and increased novelty seeking to keep the cortex busy.

The degree of preference for novelty in the immediate environment also depends on recent intensity of value-loaded stimuli, modulated by our subjective sensitivity to the latter. Sensitivity is increased by deprivation (vs addiction) for positive stimuli, and security (vs vulnerability) for the negative ones. Particularly during formative years, attention span can be increased by broad intellectual exposure, if combined with weak visceral pressures and temptations.


1/7/12

Comments from "Cognitive Focus" knol

 

Derek Zahn on my personal knol:


>"Given limited resources, there must be a trade-off between the number & the length of connections in such network..."
>I don't understand what you mean by "length" here... It seems that the topology etc of the network would be the important properties, not physical measurements...

Here I assume that innate topology of neocortex is roughly the same, - genetic variation among individuals is very minor. On the other hand, the “dense vs sparse” bias (genetic or perinatal) requires very little information, see “Developmental factors” section. Of course, adult topology is largely acquired, but acquisition process itself is affected by innate biases.

By “length” (of axons) I mean average distance between connected nodes (minicolumns?), in whatever topology. Given a fixed total length of connections (resources), greater average length of individual one-to-one connections must come at the cost of a smaller total number of these connections. Think of spindle neurons, - very few but very long connections. So, this would produce a sparser network with longer-range & more selective associations (concepts). Selection itself is probably through some variation of Hebbian “fire together, wire together”.
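The arithmetic of this trade-off can be sketched as follows. This is only an illustration: the budget and lengths are made-up numbers in arbitrary units, and connection_count is a hypothetical helper, not a model of real cortex.

```python
def connection_count(total_length: float, mean_length: float) -> float:
    """Number of connections that fit into a fixed total wiring budget."""
    if total_length <= 0 or mean_length <= 0:
        raise ValueError("lengths must be positive")
    return total_length / mean_length


budget = 1000.0  # assumed fixed total length of connections, arbitrary units

dense = connection_count(budget, mean_length=2.0)    # many short connections
sparse = connection_count(budget, mean_length=10.0)  # few long connections
print(dense, sparse)
```

With the same budget, making the average connection five times longer leaves room for only one fifth as many connections.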

Let’s face it, the brain is physical, its resources are limited, there are trade-offs to be made. Again, this knol is on gross neural bias only, I deal with algorithmic level (not necessarily neuromorphic) on my “Intelligence” knol.

Todor Arnaudov:

Cognitive abilities peaks. Saturation of learning and Generalization novelty seeking.

Hi Boris,

I'm not ready to discuss on neurological stuff, but I could on the years of peak of cognitive abilities.

I think some of the abilities are "flat" or at least could be "emulated" without deep generalization, and their peak might be more likely dependent on social status, aim at power and focus rather than special generalization shift. Science has also a social status bug, because usually researchers are supposed and they do accept to serve their master's directions until 30s (PhD, post-doc ...)

Language and stories maybe don't have that deep hierarchy or so, I don't know, but I think gifted writers and poets may reach to high or "perfect" skills as early as their 20s or even teenage years. ("Perfect" means there's not much more where to go in style and how to tell a story interestingly.)

This can happen even without reading lots of sample fiction. Acknowledgement or time-span needed to write influential works may take decades, though.

Also, you call art "fluff", but I believe talent to write stories includes a good deal of generalization.
I think art is an imitation of algorithms; the worst authors copy data, the talented and original ones induce and understand algorithms (patterns) that could generate plausible data, their algorithms are more robust and are harder to reverse-engineer given only the artwork.

(I agree that for writing literature critics, reading lots of books through many years is helpful, though.)

Maybe I'm just an exception, but was authoring pretty high generality stuff at age of 17-18-19 such as philosophy (including "my theory"); [science] fiction and fantasy with philosophical elements; was doing language engineering (lexical and semantic enrichment of Bulgarian), "genre and style" engineering, and solid sociolinguistics research.

It was a peak, and I have explanation why it declined in the following years: saturation & distractors. :)

There's a phenomenon I call exhaustion or saturation of learning. Saturation is not only cognitive (e.g. you don't get social support for the activities and give up), but there is a crucial cognitive part which is related to boredom and the conditions where the cognitive algorithm should skip too predictive patterns. That's a form of novelty seeking, and I think it contributes to shift to higher generality concepts after lower ones are saturated.

When mind extracts patterns from a given domain (set of raw data/patterns), initially it does fast and improves quickly. This can be either at same level of generalization (reaching high predictability & precision) and multi-level - increasingly abstract generalizations are discovered. However the process slows down in both directions, eventually at the highest level of generalization discovered. Mind cannot find higher level of generalization, gets bored and tends to switch to new domains in order to find:

- more unpredictable/complex patterns, starting from lowest level
- steeper function of generality increase (until another saturation)

I'd call this "Generalization novelty seeking"

I suspect persons who have higher tendency to search for inter-domain generality and the fast learners don't freeze in one single domain because they feel such saturation of generalization.

The general knowledge gotten is reused between domains and makes learning of new domains faster. Eventually domains run-out and merge, and this is accelerated by the inter-domain generalizations showing that different things are the same thing with different names.

After inter-domain saturation mind has no choice but to concentrate on higher concepts from the now merged domains, that seemed saturated before, and try to generalize further. Otherwise it would just be bored to death... :)

The not-that-inter-domain learners tend to focus to make one or a few narrow domains "perfect". They don't care or don't notice that most of the time the progress is very slow or none, they're reaching precision and generalization limits and doing the same thing over and over again with no improve.


Last edited Jul 2, 2010 5:06 PM
> I think some of the abilities are "flat" or at least could be "emulated" without deep generalization, and their peak might be more likely dependent on social status, aim at power and focus rather than special generalization shift. Science has also a social status bug, because usually researchers are supposed and they do accept to serve their master's directions until 30s (PhD, post-doc ...)

As I mentioned elsewhere, I try to focus on cognitive factors.

> Language and stories maybe don't have that deep hierarchy or so, I don't know, but I think gifted writers and poets may reach to high or "perfect" skills as early as their 20s or even teenage years. ("Perfect" means there's not much more where to go in style and how to tell a story interestingly.)

Poets, more likely than novelists (form vs. content). Just because they all write doesn’t mean it on the same level of generalization. Anyway, I’d rather not discuss art.

> Maybe I'm just an exception, but was authoring pretty high generality stuff at age of 17-18-19 such as philosophy (including "my theory"); [science] fiction and fantasy with philosophical elements; was doing language engineering (lexical and semantic enrichment of Bulgarian), "genre and style" engineering, and solid sociolinguistics research.
> There's a phenomenon I call exhaustion or saturation of learning. Saturation is not only cognitive (e.g. you don't get social support for the activities and give up), but there is a crucial cognitive part which is related to boredom and the conditions where the cognitive algorithm should skip too predictive patterns.

I was talking about the age of highest achievement. Just because you had energy, ambition, & did some work doesn’t mean you achieved much. My experience with your writing (including this comment) suggests that you’re after quantity rather than quality. It seems like a “specialist bias” to me, even as you’re trying to generalize. I think you got bored because you *did not* find any predictive patterns, had no patience to continue, & nobody paid any attention.

> That's a form of novelty seeking, and I think it contributes to shift to higher generality concepts after lower ones are saturated.
That’s a form of parroting, with wrong conclusions, & on the wrong knol.

> When mind extracts patterns from a given domain (set of raw data/patterns), initially it does fast and improves quickly. This can be either at same level of generalization (reaching high predictability & precision) and multi-level - increasingly abstract generalizations are discovered. However the process slows down in both directions, eventually at the highest level of generalization discovered. Mind cannot find higher level of generalization, gets bored and tends to switch to new domains in order to find:

You didn’t explain why discontinuous search (jumping domains) would speed up generalization. “Boredom”, “saturation” are great pop-psych terms to *obscure* the subject.

> - more unpredictable/complex patterns, starting from lowest level
>- steeper function of generality increase (until another saturation)....

Confused. It’s ironic that I, a social science major who hasn’t even *seen* a computer till 22 yo, got into formalizing bottom-up pattern discovery when younger than you are. You started programming at, what, 10 yo? This discussion belongs on the “cognitive algorithm” knol, but please don’t comment till you can suggest quantifiable criteria. I really don’t need more distractions.

Posted by Boris Kazachenko, last edited Jun 3, 2010 2:00 AM
I'm leaving you and myself alone with my abusive nonsenses, if you wish delete my comments,
but sorry - your overgeneralized offensive nonsenses are wrong.

>I was talking about the age of highest achievement. Just because you had energy, ambition, & did some work doesn’t mean you achieved much. My experience
>with your writing (including this comment) suggests that you’re after quantity rather than quality. It seems like a “specialist bias” to me, even as you’re
>trying to generalize. I think you got bored because you *did not* find any predictive patterns, had no patience to continue, & nobody paid any attention.

I engaged a professional die-hard 40+ yo philosopher and philosophy writer with opposite POV to discuss in many long letters with me, while I had read a few high school textbooks and magically inducing "working" philosophy from my mind, that a "master" like him couldn't "defeat".

Inventions from my linguistics/sociolinguistics were cited in at least two scientific papers (I wouldn't join the "mainstream" though, because I didn't have education and was criticizing their bad terminology). I befriended a leading sociolinguist (40+ yo), had a comprehensive and solid conclusion work called "The Decline of the language of Bulgarian society".

My electronic dictionary with enriched Bulgarian has thousands of downloads and keep counting; was uploaded at download space and published in a magazine by somebody else's promotion. I was interviewed for another magazine before that, for newspapers, radio and surprisingly to you I *denied* a request for a TV interview, when I was already bored of understanding and knew this won't help to achieve what my research aimed at and what it concluded.

>got into formalizing bottom-up pattern discovery when younger than you are

Because I've never really tried to formalize it, I guess.

BTW, I can speak about the language of social scientists and you apparently don't sound as a typical SS graduate. In Bulgarian it seemed like a trivial common-sense descriptive blah-blah, I was disappointed it can have so low underlying complexity (I was a high school student in a *technical* school and could understand and explain that stuff better than those graduate social idiots). The writings were just masked with a bunch of pointless foreignisms they call "terminology" - to prove it was a "science". I suspect it was the same in USSR.

Bye

Posted by Todor Arnaudov, last edited Jun 3, 2010 7:54 PM
Todor, your comments aren’t abusive, just distracting. My replies are often offensive, but that’s how you deal with distractions. If I didn’t think there’s a slight chance you might eventually contribute, I wouldn’t reply at all. My “overgeneralization” is simply a matter of selective focus.

>> I engaged a professional die-hard 40+ yo philosopher and philosophy writer with opposite POV to discuss in many long letters with me, while I had read a few high school textbooks and magically inducing "working" philosophy from my mind, that a "master" like him couldn't "defeat".
...

Not much of an achievement, is it?

>> you apparently don't sound as a typical SS graduate.

Much of my SS background was in the US, that was a lifetime ago, & I am not typical anything.

>>got into formalizing bottom-up pattern discovery when younger than you are
>Because I've never really tried to formalize it, I guess.

There's a reason for that.

Look, I am sorry to keep offending you, but it’s not personal (though you do have a very annoying habit of self-promotion). There’s only one thing worth focusing on, & I don’t care if I offend the rest of humanity to do so.

Posted by Boris Kazachenko, last edited Jun 3, 2010 8:51 PM
>My “overgeneralization” is simply a matter of selective focus
It's anti-promotion of your methodology - makes wrong predictions.

>Not much of an achievement, is it?

Of course that was about *nobody paid any attention*. Content is too long a topic & I'm tired of self-promotion to explain. I hadn't seen anything phil. to really surprise me after these years (or just a year).

Posted by Todor Arnaudov, last edited Jun 4, 2010 2:34 PM
> It's anti-promotion of your methodology - makes wrong predictions.

You have a point there, I never heard of a social scientist or a philosopher doing much to formalize cognition. But that may have more to do with the dysfunctional nature of social institutions that represent these fields. Yes, just about all writers on the subject have math, CS, EE backgrounds. But that might be because they have tangible accomplishments in their fields, which gives them the confidence to tackle such grand ambitions, & convinces people to pay attention. And these fields are closely related on a basic level, - it's all formal information processing. But the level of complexity is vastly different, so the results aren't impressive. Basically, there's no institutionalized field that's fit for the problem. It's like science in the Middle Ages, if you want to do it, you're on your own.

> Content is too long a topic & I'm tired of self-promotion to explain.


There's a difference between self-promotion & writing-up ideas in a coherent form. If you cared enough about the content to actually work on it, just give me a link. As it is, your writings in Bulgarian are just chatter & story-telling, you're not translating them because they're not worth it (& google does it pretty well). It feels strange to keep hearing about "your theory" as if it was some kind of intellectual status symbol. It's all about you & your accomplishments, & next to nothing about the subject matter.

Posted by Boris Kazachenko, last edited Jun 4, 2010 4:32 PM
>Basically, there's no institutionalized field that's fit for the problem. It's like science in the Middle
>Ages, if you want to do it, you're on your own.

Right.

>> Content is too long a topic & I'm tired of self-promotion to explain.
>There's a difference between self-promotion & writing-up ideas in a coherent form. If you cared
>enough about the content to actually work on it, just give me a link. As it is, your writings in
>Bulgarian is just chatter & story-telling, you're not translating them because they're not worth it (&
>google does it pretty well). It feels strange to keep hearing about "your theory" as if it was some kind
>of intellectual status symbol. It's all about you & your accomplishments, & next to nothing about the
>subject matter.

I'm bored to discuss on this too. I guess sometimes it's a defense, like "leave me alone, I'm busy! I'm not ready! I'm not focussed! (and never had been)"

While you are focused on the only most significant etc. thing, I've been busy and focused on *many* most significant things a day.

Yes - I don't think it's worth the time translation in that form and no one would read that long s*, I apparently have more important things to do & should compress it to next to nothing or something.

However your definition is again overgeneralized, because as a philosophy my s* was fine. I wouldn't be here if it was complete junk & whatever more I say, would be self-promotion. I'm sick & tired of this, want to be constructive, sorry for self-promotion and the spam.

One last spam: a question from a student. He said he was amused by your sentence "you need boring life" and others, and asked:

- Is Boris' theory falsifiable? Where his confidence comes from?

Funny, isn't it. I didn't have an answer. I've asked you in the past also, you answered "implementation is trivial, once you have a formal theory". Great, and what if you work 40 years to find your formal theory was wrong.

Posted by Todor Arnaudov, last edited Jun 5, 2010 4:15 PM
> Is Boris' theory falsifiable?

Ah, yes, Popperian philosophy. It’s wrong, both corroboration & falsification are a matter of degree. Think of it in Bayesian terms: facts don’t prove or disprove empirical theory, they simply increase or decrease its predictive value. In my approach, the empirical part is the definition of intelligence, & “falsifying” means finding some essential function of intelligence (in common-sense terms) that it doesn’t cover. Then I would have to generalize the definition, but that’s what I am doing anyway. The rest of my theory is deductions from the definition, where the test is not facts but internal consistency (as in math).
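A minimal sketch of what "a matter of degree" means in Bayesian terms. All numbers are invented for illustration, and bayes_update is a hypothetical helper: each observation shifts the credence in a theory up or down, without ever driving it to exactly 0 or 1 unless one of the likelihoods is itself 0.

```python
def bayes_update(prior: float, p_obs_if_true: float, p_obs_if_false: float) -> float:
    """Posterior credence in a theory after one observation (Bayes' rule)."""
    evidence = prior * p_obs_if_true + (1.0 - prior) * p_obs_if_false
    if evidence == 0:
        raise ValueError("observation is impossible under both hypotheses")
    return prior * p_obs_if_true / evidence


credence = 0.5  # assumed starting credence

# An observation that is more likely if the theory is true raises the credence.
credence = bayes_update(credence, p_obs_if_true=0.8, p_obs_if_false=0.4)
print(credence)

# An observation that is less likely if the theory is true lowers it again.
credence = bayes_update(credence, p_obs_if_true=0.2, p_obs_if_false=0.6)
print(credence)
```

Neither observation "proves" or "falsifies" anything here. They only move the number.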

> Where his confidence comes from?

Initially, probably my mother (she is pretty unique, very high oxytocin & serotonin to cortisol ratio:)). To develop long attention span, you need to have confidence to begin with, otherwise you’re stuck with 4Fs & computer games. That’s why I am skeptical about engineers, & then hard vs. soft science types. They can’t sustain their curiosity without short-term feedback, - tests, proofs, experiments, action. Intellectual insecurity.
But of course that’s just a start, confidence needs to be constantly reinforced. So, yes, successful engineers may develop confidence that affords them longer no-feedback attention span, & gain more of a “generalist bias”. But I don’t know if that can reverse early development, - the brain is not that plastic anymore. For me, the confidence is reinforced by the simple fact that I understand the issues better than anyone I’ve heard of. Kind of like you with your philosophers :).

> Great, and what if you work 40 years to find your formal theory was wrong.

I would spend 40,000 years if I had them, - nothing else is worth doing. It’s not a matter of right & wrong, it’s about making progress, faster than anyone else. Stop thinking in holistic terms, there’s no immutable “theory”. I generalize a problem & deduce solutions, it’s an incremental process. I am smarter than evolution & don’t need no stupid trial & error outside of my head. That’s where intelligence is, right?

Posted by Boris Kazachenko, last edited Jun 5, 2010 11:05 PM
Thanks for the answers!

>That’s why I am skeptical about engineers, & then hard vs. soft science types. They can’t sustain
> their curiosity without short-term feedback, - tests, proofs, experiments, action. Intellectual
> insecurity.

I also don't like too pure hard science types if that means they're too formal, too “mathematical” and focused only in details, but to me the best scientists are interdisciplinary hybrids.

>>Where his confidence comes from?
>Initially, probably my mother (she is pretty unique, very high oxytocin & serotonin to cortisol
> ratio:))

Indeed, you've once said "love is a stupid addiction". I agreed then, but recently felt ashamed of this and reconsidered “stupid”. To you everything is stupid and a waste of time, but the causes of the start of the neurotransmitter addictive cycle in real love are cognitive: matches and predictions, she's like you've wanted her to be, like made for you. Love at first sight is possible (empirically proven), also one can fall in love before puberty.

Addiction is with a reason, because while the beginning and the end of love are destructive to mind, the middle is stable and puts neurotransmitters and hormones in a comfort state (this is a desired state, addiction helps to keep you there); encourages focus on the most significant one, instead of doing novelty seeking; there's a dedicated long-lasting intellectual partner to discuss with and not the least, there's plenty of oxytocin in the neurotransmitter soup.


>But of course that’s just a start, confidence needs to be constantly reinforced. So, yes,
> successful engineers may develop confidence that affords them longer no-feedback attention
> span, & gain more of a “generalist bias”. But I don’t know if that can reverse early development, -
> the brain is not that plastic anymore. For me, the confidence is reinforced by the simple fact that I
> understand the issues better than anyone I’ve heard of. Kind of like you with your philosophers :).

Regarding the ages of peaks, if I'm not mistaken Marx & Engels did the initial core of their work in their 20s (1840s). OK, "it's not much". :)

Posted by Todor Arnaudov, last edited Jun 8, 2010 8:41 AM
> the best scientists are interdisciplinary hybrids

Not necessarily as scientists, definitely not now. One of the reasons people go into hard sciences is the promise of certainty, something you don't get when you go into meta-science.

> Indeed, you've once said "love is a stupid addiction"

Damn, I gave you an excuse to go off about love again. Addiction is when you focus on things you shouldn't be focusing on, even if that puts you in a "comfort state".

> Marx & Engels did the initial core of their work in their 20s (1840s). OK ,"it's not much". :)

Right. Ideology is really a form of art: the purpose is to make an impression, not to make sense. But this is not an excuse for you to go off about art:).

Posted by Boris Kazachenko, last edited Jun 8, 2010 11:26 PM
>> Indeed, you've once said "love is a stupid addiction"
>Damn, I gave you an excuse to go off about love again. Addiction is when you focus on things you shouldn't be focusing on, even if that puts you in a
>"comfort state".

But by putting you in a "comfort state", some addictions might be indirectly helpful for other purposes for ones without hormonal or neurotransmitter genetic advantages.

>> Marx & Engels did the initial core of their work in their 20s (1840s). OK ,"it's not much". :)
>Right. Ideology is really a form of art: the purpose is to make an
>impression, not to make sense. But this is not an excuse for you to go
>off about art:).

Communism is an ideology, but Dialectical Materialism is a philosophy.
It's quite general in definitions, but if not more, DM at least notices important aspects such as mind imitating its inputs and builds itself on them, emergent behavior; evolution, hierarchical organization of matter and transition to higher levels of matter with preserving "good" from the past; from specific to general/abstract etc.

Posted by Todor Arnaudov, last edited Jun 9, 2010 6:17 AM
> But by putting you in a "comfort state", some addictions might be indirectly helpful for other purposes for ones without hormonal or neurotransmitter genetic advantages.

Good luck with that, but I don't think she'll leave you in that state long enough. Women are practical people, they have their priorities & stimulating your abstract thoughts is not likely to be one of them (although they're great at faking common interests at first).

> Communism is an ideology, but Dialectical Materialism is a philosophy

DM was an afterthought for Marx (he never actually used the term), mostly borrowed from Positivism, fashionable at the time. He was a "philosophical" rabble-rouser at heart. "Dialectical" part is meaningless, & "Materialism" is just an excuse to trash religion (not that there's anything wrong with that).

Posted by Boris Kazachenko, last edited Jun 9, 2010 9:17 PM
Thanks... Actually "love's no friend" of mine, I've been trying to kill it because it has been killing me, but I don't manage yet - except in artworks. :) She's a deep character though, plot-twists are possible.

This is about romantic love, but I guess the other kinds - friendship, friendliness, empathy and socialization in general are providing healthy neurotransmitters and hormones as well (a speculation). It's an "anti-reclusive health strategy" if your brain fails to generate the right chemicals while feeling lonely.

EDIT: BTW, actually after reading a bit on the topic of oxytocin, I suspect I may have high oxytocin as well, even when lonely or in unrequited love. Maybe that's why I don't succeed in killing love and keep falling in.

>DM was afterthought for Marx (he never actually used the term), mostly borrowed Positivism,
>fashionable at the time.

OK, I guess it's more of Lenin than Marx.

>He was a "philosophical" rabble-rouser at heart. "Dialectical" part is meaningless, & "Materialism" is
>just an excuse to trash religion (not that there's anything wrong with that).

I do see low-complexity, obvious definitions and plays with words. I guess one of the basics, the never ending fight between two opposites/contradictions can be derived from the minimal possible number of different elements: 2 + the basic DM assumption of never-ending motion/change (and maybe the ideology assumption of class struggle).

Ah, and another question from Georgi. I've told him what you've told me you're doing for a living and why.

He asked:

Georgi: Isn't it a dangerous job? What's going to happen to your advanced yet unintelligible-by-others work if an accident happens or so? It might be lost for the world. [Do you care?]

Posted by Todor Arnaudov, last edited Jun 11, 2010 2:56 PM
I said my mother must've had high oxytocin, definitely not me. Opposite effect. Anyway, all that neurobabble is pointless, use your common sense. Kicking the "life" habit is like kicking any other habit. It's hard at first, but if you stick to it, *&* work semi-productively, the work will take over as the main habit.
Lose life, move to the countryside or something. Try it, go on vacation, anything else is just an excuse.
The best confidence is the one you gain by doing real work.

Appreciate the concern, but my job is not dangerous at all, & only takes ~1 hour of my time per shift. But even that is excessive, I seriously consider quitting it.

Posted by Boris Kazachenko, last edited Jun 11, 2010 8:01 PM
>I said my mother must've had high oxytocin, definitely not me. Opposite effect.

OK - I presumed oxytocin effects might be linked to your health and confidence.

>Appreciate the concern, but my job is not dangerous at all, & only takes ~1 hour of my time per
>shift. But even that is excessive, I seriously consider quitting it.

:)

>Lose life, move to the countryside or something. Try it, go on vacation

That's a good idea, I've been considering it (in the mountains) and may try it this summer, but it probably would be short. It may be a silly excuse to you, but I couldn't sustain long a living with my savings and current scarce earnings.

You give prizes, but it's risky because reaching there may take me too long and I may fail.
In order to relax, I need a back-up plan and financial security... :-|

Posted by Todor Arnaudov, last edited Jun 12, 2010 3:13 PM
> OK - I presumed oxytocin effects might be linked to your health and confidence.

No, that's probably early serotonin exposure (http://www.raysahelian.com/serotonin.html.) Serotonin is upstream from oxytocin & far more general (oxytocin is specific to social interactions). That pop-sci article on your blog totally ignores it, probably because the effects are too complex & not as flashy. The "peace of mind" effect is receptor-specific & not well understood. My point is, you can have a peace of mind without going through all the nonsense of social interaction :).
Another thing that article totally misinterpreted is the fact the oxytocin is anti-addictive.
What that means is, prior exposure to oxytocin will make you crave social support less, not more, so you may get less social :).

> You give prizes, but it's risky because reaching there may take me too long and I may fail. In order to relax, I need a back-up plan and financial security... :-|

Of course it's risky, anything worthwhile is. But consider the alternative: you'll never make a difference. You're so far behind, any delay means you may never catch up. Are you willing to take that risk? Relaxing is more a matter of lifestyle & immediate environment than financial security (heck, you can grow your own potatoes :)).
Anyway, I sent you a loan to be paid by a future prize, just to show that I am serious.
You owe me an insight :).

Posted by Boris Kazachenko, last edited Jun 12, 2010 10:54 PM
>> OK - I presumed oxytocin effects might be linked to your health and confidence.
>No, that's probably early serotonin exposure
>(http://www.raysahelian.com/serotonin.html.) Serotonin is upstream from
>oxytocin & far more general (oxytocin is specific to social interactions).

OK (social interactions - "animate objects"... )

>That pop-sci article on your blog totally ignores it, probably because the
>effects are too complex & not as flashy. The "peace of mind" effect is
>receptor-specific & not well understood.

Thanks for reading!

>My point is, you can have a peace of mind without going through all the
>nonsense of social interaction :).

Right - ascetics, monks...

>Another thing that article totally misinterpreted is the fact the oxytocin
>is anti-addictive. What that means is, prior exposure to oxytocin will make
>you crave social support less, not more, so you may get less social :).

I think there are clues for this from life: picking up a girlfriend and "real love" often cause less interest in socializing with people other than The One; also - losing touch with friends.

>Of course it's risky, anything worthwhile is. But consider the alternative:
>you'll never make a difference. You're so far behind, any delay means you
>may never catch up. Are you willing to take that risk? Relaxing is more a
>matter of lifestyle & immediate environment than financial security (heck,
>you can grow your own potatoes :)).
>Anyway, I sent you a loan to be paid by a future prize, just to show that I
>am serious.
>You owe me an insight :).

Thank you for your generosity and expectations! :)
Hope to deserve the loan... I just couldn't do too radical things immediately and for too long.

Collaboration may be found, I already met a core of smart and attracted students, who are willing to keep the communication and continue discussions out of class. I plan to have open lectures and another more advanced and focused course or two in the University next year, it may include lessons on generalizing (hope to progress by then), and brain-storming/generalizing real problems together.

Regarding love, it's being distracting and hurting, but it may happen to be helpful for concentration anyway - if it fails to be because of oxytocin, it may succeed by turning into action my maxim: "I'm most inspired when I'm most despaired"...

Posted by Todor Arnaudov, last edited Jun 13, 2010 4:49 PM
> "I'm most inspired when I'm most despaired"...

Probably in the wrong direction, desperation shrinks attention span. Good read: http://www.theamericanscholar.org/solitude-and-leadership/

Posted by Boris Kazachenko, last edited Jun 15, 2010 12:35 AM
Thanks for the link. I agree with the essay, with some exceptions: would discuss about multi-tasking and social networks, don't think it's Black and White.

>> "I'm most inspired when I'm most despaired"...
>Probably in the wrong direction, desperation shrinks attention span.

It's perhaps quite poetic & sentimental, there's a continuation of the maxim: "I'm the most inspired creator" ~ the most despaired, it shouldn't be a regular despair. Some grief and understanding of sort of hopelessness (prediction of inevitably undesirable outcomes) is not a real (chemical) despair, more likely I do Repair and Inspire again.

There might be a "vector" of different inspirations/desperations [of ...]. I've got some thoughts that could be related to such a "vector", distractors, what you call ADD bias/search for novel specifics, domain jumping and "threads" in mind (mind as not really integrated system), but I'd think some more.

If not mistaken this direction is related to the 4-th level in your cognitive hierarchy, you've mentioned Economy of cognitive resources or something.

Posted by Todor Arnaudov, last edited Jun 16, 2010 5:35 PM
> There might be a "vector" of different inspirations/desperations [of ...]. I've got some thoughts that could be related to such a "vector", distractors, what you call ADD bias/search for novel specifics, domain jumping and "threads" in mind (mind as not really integrated system),

Right, desperation might lead to radical change, & you definitely need it, but you use every excuse in the universe to avoid it.
Novelty seeking has many different aspects, you need to be analytical about it.
In my interpretation, valuable “novelty” is actually an incrementally abstract type of correspondence.

> If not mistaken this direction is related to the 4-th level in your cognitive hierarchy, you've mentioned Economy of cognitive resources or something.

Those levels are way out of date, too coarse & analogical. Resource allocation is what every level does.
My work now is strictly quantitative & incremental, the levels are defined by the type of correspondence they select for. Higher types are recurrent subsets of lower types.
The first four are: magnitude ) matched magnitude ) projected match ) additional projection...
Try to formalize those, on the "cognitive algorithm" knol :).

Posted by Boris Kazachenko, last edited Jun 16, 2010 9:19 PM
>desperation might lead to radical change, & you definitely need it,
>but you use every excuse in the universe to avoid it.

I've been doing lots of things so far and probably will keep doing so, but I think I am progressing in the right direction. Part of my "distractors" in the last half year or so, and still now, has been reading/studying and preparing and conducting the AGI course.

However I'm tired of reading and it won't get work done, I've always preferred thinking on my own, so this time-slice is about to go to active mode.

>Novelty seeking has many different aspects, you need to be analytical about it.

OK.

>In my interpretation, valuable “novelty” is actually an incrementally
>abstract type of correspondence.

I supposed so - "novel generality"; maybe novelty that allows inducing novel generality. I think searching and getting lots of samples may help to find it, though. Having lots of similar patterns in mind promotes compression and generalization.

>> If not mistaken this direction is related to the 4-th level in your
>>cognitive hierarchy, you've mentioned Economy of cognitive resources or
>>something.
>Those levels are way out of date, too coarse & analogical. Resource
>allocation is what every level does.

OK. :)


>My work now is strictly quantitative & incremental, the levels are defined
>by the type of correspondence they select for. Higher types are recurrent
>subsets of lower types.
>The first four are: magnitude ) matched magnitude ) projected match )
>additional projection...
>Try to formalize those, on the "cognitive algorithm" knol :).

Nice, thanks. :) I haven't forgotten the other one, as well; have reflected, but too little, yet.

BTW, what's your line on Ben Goertzel? Trying to cover him, but I'm not sure for how long I will sustain it. To me Schmidhuber's "tune" is better. I like Goertzel as an enthusiastic guru and popularizer, but he seems to be strongly influenced by high-level Cognitive science & NLP... Maybe partially that's because he's quite impatient to sell products immediately, and cognitive architecture style is more socially acceptable/"apparently should be working"/"pop-sci"...

Posted by Todor Arnaudov, last edited Jun 18, 2010 12:54 PM
Reading is great, I just can't find anything useful

> maybe novelty that allows inducing novel generality

Right, but that accumulation of data must be increasingly selective. Novelty corresponds to spatial discontinuity in input flow. That’s macro-selection, & syntactic differentiation of past inputs by comparison is micro-selection. Discontinuous input coordinate selection is directed by projection: the value of syntactic re-integration at that coordinate. Now, try to formalize those operations.

> BTW, what's your line on Ben Goertzel? Trying to cover him, but I'm not sure for how long I will sustain it. To me Schmidhuber's "tune" is better.

Goertzel is a social butterfly / tinkerer. His definition of intelligence is meaningless, he has no theory & doesn’t think he needs one, - “it’s an engineering problem”. Schmidhuber is more coherent, but he doesn’t get *incremental*. Mathematicians are trained to deal with complex operations, they think starting simple is beneath them. Yet, without simple incremental steps there’s no scalability. Anyway, I’d rather discuss issues than people.

Posted by Boris Kazachenko, last edited Jun 18, 2010 8:40 PM
>> maybe novelty that allows inducing novel generality
>Right, but that accumulation of data must be increasingly selective.

And I guess, unlike "normal novelty", this kind of novelty may be found by re-evaluation/focus on very old recorded data, leading to "Eureka!". Generalization is lossy, thus lower-generality records should have more details/features to select from.

>Anyway, I’d rather discuss issues than people.

Fine, I meant Goertzel's work; you did: the issues on "engineering problem", mathematicians' training and incrementability.

>Now, try to formalize those operations.

OK, got tough tasks - will comment in Cognitive algorithm knol when got something to say on...

Posted by Todor Arnaudov, last edited Jun 20, 2010 3:50 PM
> And I guess, unlike "normal novelty", this kind of novelty may be found by re-evaluation/focus on very old recorded data,

Again, the oldest data would be on the highest levels, although there can be another hierarchy of storage costs within each level. On the first level the cost also includes default comparison, then it's only storage, from new: RAM, to old: tape. The original form of novelty seeking would be to actually "look" at the new locations. That would require motor feedback, but yes, it's not principally different from feedback within the hierarchy.
So, older inputs should be displaced in FIFO order into cheaper storage, as long as the cost of transfer & storage declines faster than predictive value of the inputs.
I guess I was wrong to dismiss your idea of buffering old inputs.
Congratulations, you won the first prize! (did you have any problems with PayPal?). It’s worth more than $100, but the idea itself is simple, it needs to be justified in terms of costs vs. benefits.
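
The displacement rule above (move older inputs into cheaper storage for as long as holding them costs less than they are expected to be worth) can be sketched in a few lines. This is only an illustration under stated assumptions, not the algorithm discussed in this thread: the tier names and costs, the exponential decay of predictive value with age, and the helper names `Tier`, `predictive_value` and `choose_tier` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tier:
    """One storage stage; cost is an assumed combined transfer & storage cost per step."""
    name: str
    cost: float


# Hypothetical tiers, ordered from most expensive (newest inputs) to cheapest (oldest).
TIERS = (Tier("ram", 1.0), Tier("disk", 0.1), Tier("tape", 0.01))


def predictive_value(initial_value: float, age: int, decay: float) -> float:
    """Assumed model: predictive value of an input decays exponentially with its age."""
    return initial_value * decay ** age


def choose_tier(value: float, tiers: Sequence[Tier]) -> Optional[str]:
    """Return the first (most expensive) tier whose cost is still below the value.

    Returns None when even the cheapest tier costs more than the input is worth,
    i.e. the input should be deleted rather than stored.
    """
    for tier in tiers:
        if value > tier.cost:
            return tier.name
    return None


if __name__ == "__main__":
    # Follow a single input as it ages through the tiers.
    for age in range(10):
        value = predictive_value(2.0, age, 0.5)
        print(age, value, choose_tier(value, TIERS))
```

With these made-up numbers an input starts in RAM, drops to disk and then to tape as its value decays, and is deleted once even tape costs more than the input is expected to be worth.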

Posted by Boris Kazachenko, last edited Jun 21, 2010 5:03 AM
>> And I guess, unlike "normal novelty", this kind of novelty may be found by re-evaluation/focus on very old recorded data,
>Again, the oldest data would be on the highest levels, although there can be another hierarchy of storage costs within each level.

I think I see - higher level --> higher range of patterns in space and time.

In this comment I meant something else - a hippocampal-style playback, sort of conscious(?) recall/re-evaluation. Records in memory are built from [sequences] of hierarchical pieces/concepts. When playing back, mind may select pieces at a given level of abstraction from one or many different records, which themselves were selectively recalled in a sequence. More abstract pieces/concepts may be induced from these pieces and the new higher concepts may be recorded back to the old memories.

I.e. mind may do this by introspection on data which are in memory anyway, no need to read or search in the world. (In the context of your line: "Reading is great, I just can't find anything useful")

Also I think it's related to the issue with wise books read too early. E.g. if a little boy reads at school Exupery's "The Little Prince", he's not likely to understand the metaphors and deep meanings, but he may remember the stories literally. Many years later if he recalls them even without re-reading, he might understand their moral. This particular phenomenon might not be "inducing", but "matching" to already understood higher concepts, though.


>On the first level the cost also includes default comparison, then it's only
>storage, from new: RAM, to old: tape. The original form of novelty seeking
>would be to actually "look" at the new locations. That would require
>motor feedback, but yes, it's not principally different from feedback within
>the hierarchy. So, older inputs should be displaced in FIFO order into
>cheaper storage, as long as the cost of transfer & storage declines faster
>than predictive value of the inputs.
>I guess I was wrong to dismiss your idea of buffering old inputs.
>Congratulations, you won the first prize!

Thanks! :)

>It’s worth more than $100, but the idea itself is simple, it needs to be
>justified in terms of costs vs. benefits.

OK :)

>(did you have any problems with PayPal?).

Unfortunately I did - sent you an email.

Posted by Todor Arnaudov, last edited Jun 21, 2010 4:49 PM
> In this comment I meant something else - a hippocampal-style playback, sort of conscious(?) recall/re-evaluation...

Damn, you've gone all analogical on me again :).
I understand, you're talking about extended storage within each level of detail | generalization, in case memory origin's location becomes relevant in the future. In my understanding, a level is ordered as a FIFO: proximity = priority (the sequence may include spatial frames, or whatever). New inputs displace the old ones till they get pushed out of the queue: selectively elevated, & deleted as obsolete on the current level.
That still stands, but I realized that this push-out should be multi-stage, - into less expensive memory (if available) instead of immediate deletion.
The first queue must be short because it's very expensive: all inputs are immediately compared, generating redundant representations (overlapping derivatives). The following stage queues can be much longer because they're cheaper: inputs are stored but not compared unless their location comes into "focus" again.
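
A minimal sketch of such a multi-stage push-out, assuming numeric inputs and taking "comparison" to mean a plain difference between a new input and each input still held in the first queue. The class name `StagedBuffer`, the stage capacities and the difference-based comparison are assumptions made for illustration; selective elevation to a higher level is left out, so anything pushed out of the last stage is simply returned as deleted.

```python
from collections import deque


class StagedBuffer:
    """FIFO with several stages: only the first (short) stage compares inputs."""

    def __init__(self, capacities):
        # capacities[0] is the short, expensive queue; later ones are longer and cheaper.
        self.capacities = list(capacities)
        self.stages = [deque() for _ in self.capacities]
        self.comparisons = []  # (old, new, difference) tuples from the first stage

    def push(self, item):
        """Add a new input; return the input deleted from the last stage, if any."""
        # Expensive part: the new input is compared with everything in stage 0.
        for old in self.stages[0]:
            self.comparisons.append((old, item, item - old))
        carry = item
        for stage, capacity in zip(self.stages, self.capacities):
            stage.appendleft(carry)   # newest input sits at the left end
            if len(stage) <= capacity:
                return None           # nothing was pushed out of this stage
            carry = stage.pop()       # oldest input moves on to the next, cheaper stage
        return carry                  # fell off the last stage: deleted


if __name__ == "__main__":
    buf = StagedBuffer([2, 3])
    for x in range(1, 7):
        dropped = buf.push(x)
    print(list(buf.stages[0]), list(buf.stages[1]), dropped)
```

Inputs in the later stages are only stored. Bringing one back into "focus" would mean moving it to the first stage again, which this sketch does not attempt.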

Posted by Boris Kazachenko, last edited Jun 21, 2010 10:51 PM
>Damn, you've gone all analogical on me again :).
>I understand, you're talking about extended storage within each level of
>detail | generalization, in case memory origin's location becomes relevant
>in the future.

Nice. :) I haven't thought in these precise terms though (origin's location, I remember you mentioned once in a recent posting about hippocampus in my blog). If you wish and have patience, check out the translation of my old speculations and some new.

Regarding the prize and buffers, longer look-back buffers for recent inputs (seems at any levels for any purpose) also for locations (context) would be an advantage for faster pattern discovery allowing track back if needed, very useful for patterns extended in time ("delayed"), with which my speculations in the knol began.

Anyway, memory was a milestone in my old writings, a starting point, such as a theoretical induction of neocortex and hippocampus (-like modules/effects) based on the evidence from consciousness as biographical memory and the fact that mind does learn and perform increasingly better before & without having such memories in the early age.

I assumed hippocampus-style-memory, or "Events Operating System/Memory" (EOS) is somewhat higher level in mind than "Executive Operating System/Memory" (EXOS, neocortex and patterns there).

EOS effects are more abstract than of "Executive OS" (EXOS, neocortex), because EOS is an add-on to neocortex, I suspect there must be levels of generality and "discretization points" already developed in neocortex, in order the hippocampus-style memory to start working.

Just published a translation of some sections and some new speculations on the decline of neuroplasticity in relation to the hippocampal-style memory (too long to put it here): http://artificial-mind.blogspot.com/2010/06/teenage-theory-of-mind-and-universe.html

...


Regarding your whole comment - indeed I think using FIFOs (either simple or priority queues), fast caches and levels of memory might be universal rules-of-thumb from Engineering.

Posted by Todor Arnaudov, last edited Jun 24, 2010 5:08 PM
> Regarding the prize and buffers, longer look-back buffers for recent inputs (seems at any levels for any purpose) also for locations (context) would be an advantage for faster pattern discovery allowing track back if needed, very useful for patterns extended in time ("delayed"), with which my speculations in the knol began.

Yes, I said I was wrong to dismiss it… Oh, you want me to change my reply there? Will do, as soon as I edit the knol itself, it doesn't address that point at all.

> I assumed hippocampus-style-memory, or "Events Operating System/Memory" (EOS) is somewhat higher level in mind than "Executive Operating System/Memory" (EXOS, neocortex and patterns there). EOS effects are more abstract than of "Executive OS" (EXOS, neocortex), because EOS is an add-on to neocortex, I suspect there must be levels of generality and "discretization points" already developed in neocortex, in order the hippocampus-style memory to start working.

This is backwards. Buffering is not higher than anything, there’s no abstraction going on, or any processing for that matter, just simple copying. It’s not higher in scope either, the macro-structure is hierarchy, sequence is a structure within its levels.
As for hippocampus, it’s not an add-on, the neocortex is (Hawkins is dead wrong here). Hippocampus, otherwise known as archicortex, is a primitive 3-layer structure, part of “reptilian brain”. Neocortex developed later, both in phylogeny & in ontogeny. The fact that hippocampus is necessary to form declarative memories is an evolutionary bug (brain is full of those). Ideally, the neocortex + thalamus, maybe striatum, should be doing all the work, the rest of the brain can go extinct.
Look, neuroscience at its current state is just a vague inspiration for understanding intelligence, you need to keep it separate from theoretical work.

> Just published a translation of some sections and some new speculations on the decline of neuroplasticity in relation to the hippocampal-style memory (too long to put it here): http://artificial-mind.blogspot.com/2010/06/teenage-theory-of-mind-and-universe.html

OK, not bad for a teenager, but can we please get on? For example, stop abusing "computerese" terms & acronyms that really just obscure the subject (for yourself). The neuroplasticity stuff sounds random to me.
If you have any ideas you want to discuss *now*, great, but you need to formalize them. Otherwise, your signal-to-noise ratio is too low to bother. Seriously, you *talk* about compression, why not try to practice it?

> Regarding your whole comment - indeed I think using FIFOs (either simple and priority queues), fast caches and levels of memory might be universal rules-of-thumb from Engineering.

You don’t need to know any engineering to understand these things. Yes, there’re plenty of useful ideas in engineering. But, for a strictly incremental approach, selecting the right ones is harder than deducing them from the first principles (kind of like picking good ideas from your writing). The economics change as you get into advanced math & engineering, but these should not be necessary for a basic learning algorithm. At least not at the stage I am working on now.

Posted by Boris Kazachenko, last edited Jun 25, 2010 6:08 AM
>Oh, you want me to change my reply there? Will do, as soon as I edit the knol itself, it doesn't address that point at all.

I don't mind, you're the host (but yes, it would be useful for other readers), I was rather thinking aloud/recalling that stuff/searching for connections...

>This is backwards. Buffering is not higher than anything, there’s no abstraction going on, or any processing for that matter, just simple copying. It’s not higher in scope either, the macro-structure is hierarchy, sequence is a structure within its levels.

Buffering - OK, but I guess episodic memory may need discretization points and patterns (compression) before working. Yes, mind can get quick this to some extent.

>As for hippocampus, it’s not an add-on, the neocortex is (Hawkins is dead wrong here). Hippocampus, otherwise known as archicortex, is a primitive 3-layer structure, part of “reptilian brain”. Neocortex developed later, both in phylogeny & in ontogeny.

OK... Hmm... So do you suggest that archicortex/hippocampus function in lower species, and what lasted in humans, might be literal copying of perceptions - saving locations/mapping the environment so that the animal can find its lair and remember where there was food? (This makes sense to me.)

I suspect that functionally archicortex might be a recorder/associative memory, but lacks generalization/compression/prediction capabilities, or may have some compression but as a side effect, like using low resolution/precision. BTW I've been studying about reptilian brain, checked a little about amphibians' (it's 3-layer as well) - amphibians seem much closer to reptiles than reptiles to mammals, a smaller evolution step, so supposedly it's easier to grasp something meaningful about it. Indeed, isn't archicortex even of amphibian ancestry (archipallium)? At least what I've read is that reptiles also have a neopallium, supposed to have evolved into neocortex.

>The fact that hippocampus is necessary to form declarative memories is an evolutionary bug

OK, interesting. :)


> evolutionary bug (brain is full of those). Ideally, the neocortex + thalamus, maybe striatum, should be doing all the work, the rest of the brain can go extinct.

I also think genes are messy and brain design is "spaghetti code". Brain has been patched over and over and early design decisions had been dragged along all the time, because it was not possible otherwise using this technology. The higher layers had to be adapted to use the lower ones and I think bugs are very likely to come when there are functional overlaps between a new module and an old module. Such ones can be found easily, depending on how deeply the system is analyzed. I guess this can be the case with hippocampus and neocortex, because neocortex is also recording/copying perceptions.

Sorry to mention software engineering, but I guess this bugs-issue might be related to so called "coupling": http://en.wikipedia.org/wiki/Coupling_(computer_science)

>Look, neuroscience at its current state is just a vague inspiration for understanding intelligence, you need to keep it separate from theoretical work.

OK...

>but can we please get on? For example, stop abusing "computerese" terms & acronyms that really just obscure the subject (for yourself). The neuroplasticity stuff sounds random to me. If you have any ideas you want to discuss *now*, great, but you need to formalize them. Otherwise, your signal-to-noise ratio is too low to bother. Seriously, you *talk* about compression, why not try to practice it?

I understand, I will get on, these terms were very old; I try to compress sometimes, but still often couldn't afford long-enough sustained concentration, distracted with other things waiting in the pipelines to be done...

>You don’t need to know any engineering to understand these things.

That's right about understanding, and other good concepts are also simple: pipeline (FIFO-related), branch prediction (prediction), out-of-order execution, superscalarity & parallelism in general. Practicing engineering helps keeping in mind they might be useful.

I guess that may go also for general software engineering guidelines, design patterns etc.

>Yes, there’re plenty of useful ideas in engineering. But, for a strictly incremental approach, selecting the right ones is harder than deducing them from the first principles (kind of like picking good ideas from your writing). The economics change as you get into advanced math & engineering, but these should not be necessary for a basic learning algorithm. At least not at the stage I am working on now.

OK

Posted by Todor Arnaudov, last edited Jun 27, 2010 7:38 PM
> I don't mind, you're the host (but yes, it would be useful for other readers),

What, both of them? Done.

> Buffering - OK, but I guess episodic memory may need discretization points and patterns (compression) before working. Yes, mind can get quick this to some extent.

Discretization, yes, that’s multi-stage buffering. Compression: only symmetrical, non-selective transforms. Selection = elevation, this is already hierarchical processing, buffering is within levels. Any discontinuous comparison (pattern discovery) generates redundant representations, thus requires selection to be compressive.
Also, buffering is more useful for spatial focus shifts, which are reversible, than for purely temporal “obsolescence”, which is not. Of course, reversal can be over derived, as well as original, coordinates.

> OK... Hmm... So do you suggest that archicortex/hippocampus function in lower species and what lasted in humans might be literal copying of perceptions - saving locations/mapping the environment/in order the animal to find its lair and remember where there was food. (This makes sense to me.)

I was talking about buffering in conceptual terms, hippocampus probably does bunch of other things too.

> BTW I've been studying about reptilian brain, checked a little about amphibian's (it's 3-layers as well), amphibians seem much closer to reptiles than reptiles to mammals, a smaller evolution step, supposing easier to grasp something meaningful about it.
> Isn't archicortex of amphibian ancestry (archipallium)? At least from what I've read, reptiles also have a neopallium, supposed to have evolved into neocortex.

Perhaps, don't know much about it.

> The higher layers had to be adapted to use the lower ones and I think bugs are very likely to come when there are functional overlaps between a new module and an old module. Such ones can be found easily, depending how deep the system is analyzed. I guess this can be the case with hippocampus and neocortex, because neocortex is also recording/copying perceptions.

Not also, almost all memory (sequential & hierarchical) is in neocortex. But we didn't evolve as free thinkers, in evolutionary context "important" information is about things that are "close" to you. I don't think hippocampus holds or transfers much memory, but it associates memories with locations, & strengthens the ones that are | will be "close". I am sure neocortex is perfectly capable of representing maps (as in temporal lobe), but hippocampus already did that, & was left at it. So, neocortex evolved to depend on hippocampus to tell it what's important enough to be conscious of (declarative).

> That's right about understanding, and other good concepts are also simple: pipeline (FIFO-related), branch prediction (prediction), out-of-order execution, superscalarity & parallelism in general. Practicing engineering helps keeping in mind they might be useful.

Right, but it also gives you a "man with a hammer" syndrome. Thinking in terms of engineering about the problem is one thing, actually training / working as an engineer on unrelated projects creates biases you're not even aware of. And all possibly practical projects are utterly *unrelated*.

Posted by Boris Kazachenko, last edited Jun 30, 2010 6:08 PM
Sorry if you don't really care about this neuro stuff, I put it here for completeness about the palliums, because archicortex even seems to be fish brain...

fish --> archipallium --> hippocampus
amphibia --> paleopallium --> cingulate cortex and other limbic cortex parts
reptiles --> neopallium --> neocortex

http://wiki.cns.org/wiki/index.php/Paleopallium/archipallium
http://en.wikipedia.org/wiki/Limbic_cortex

T>> I don't mind, you're the host (but yes, it would be useful for other readers),
B>What, both of them? Done.

Cool, I may include it in my resume! :P

B>Also, buffering is only useful for spatial focus shifts, which are reversible, not for temporal “obsolescence”, which is not.

Maybe I don't understand your point correctly, but I guess buffering of any irreversible sequences would be advantageous, namely because it would be impossible to go back and re-input them by the sensors.

B>Not also, almost all memory (sequential & hierarchical) is in neocortex.

OK, I was emphasizing on the supposed functional overlap - quality rather than quantity...

T>> That's right about understanding, and other good concepts are also simple: pipeline (FIFO-related), branch prediction (prediction), out-of-order execution, superscalarity & parallelism in general. Practicing engineering helps keeping in mind they might be useful.

B>Right, but it also gives you a "man with a hammer" syndrome. Thinking in terms of engineering about the problem is one thing, actually training / working as an engineer on unrelated projects creates biases you're not even aware of. And all possibly practical projects are utterly *unrelated*.

I don't mean that these ideas by themselves solve the AGI problem; they are general optimization/implementation suggestions that might speed up and be useful for "the solution". Computer engineering is mostly about providing raw speed (and I think so far it is incremental there), and optimizations of the implementation might be important in the very beginning of AGI.

Also, I like "big engineering": design reaching to inventive ideas, architectural innovations, understanding leading to leaps, like IBM "Stretch" or some of Cray's computers. It's like science and art. However, to have a chance to practice that kind of engineering professionally you depend a lot on social status, which is usually gained through many years of activities, most of which consist of predictable and boring, not creative, work: solving problems that just need time to implement and debug.

Actually I do agree that you shouldn't practice that kind of engineering for too long, not spend decades or a career there. You can grasp the important ideas and think of architectures with much less effort.

Posted by Todor Arnaudov, last edited Jul 1, 2010 5:00 PM
> Sorry if you don't really care about this neuro stuff,

It's fun, but doesn't seem to be terribly relevant. I prefer to think in terms of function, biological analogues are too macro, functionally mixed, & not well understood. That goes for “engineering” discussion too :).

B>Also, buffering is only useful for spatial focus shifts, which are reversible, not for temporal “obsolescence”, which is not.

> Maybe I don't understand your point correctly, but I guess buffering of any irreversible sequences would be advantageous, namely because it would be impossible to go back and re-input them by the sensors.

You're thinking in terms of costs, not benefits. Experience has no intrinsic value, the purpose is to predict, & you don’t predict the past. The data in a buffer is addressable by its coordinates, you retrieve it if:
a) the location of expected inputs matches that in a buffer again, in case of spatial shifts,
b) the pattern in new inputs is stronger than average, which means it should search further than expected, both forward & backward (in the buffer).
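The two retrieval conditions above can be sketched in Python. This is only a toy illustration, not Boris's algorithm: the buffer is assumed to be a list of (coordinate, value) pairs, and the names `retrieve_by_location`, `retrieve_by_strength`, `base_range`, and the rule that the search range grows by one step per unit of above-average strength are all hypothetical.

```python
def retrieve_by_location(buffer, expected_coord):
    """Condition (a): return buffered values whose coordinate matches
    the location of the expected inputs (a spatial shift back)."""
    return [value for coord, value in buffer if coord == expected_coord]


def retrieve_by_strength(buffer, position, strength, average_strength, base_range=1):
    """Condition (b): if the pattern is stronger than average, search
    further than expected, both backward and forward in the buffer.

    Assumption: the search range grows by one buffer step per unit of
    above-average strength. This scaling is made up for illustration.
    """
    if strength <= average_strength:
        return []  # not stronger than average: no extended search
    search_range = base_range + int(strength - average_strength)
    start = max(0, position - search_range)
    stop = min(len(buffer), position + search_range + 1)
    return [value for _, value in buffer[start:stop]]


# Hypothetical buffer of inputs addressed by 2D coordinates.
buffer = [((0, 0), "a"), ((1, 0), "b"), ((2, 0), "c"), ((1, 0), "d"), ((3, 0), "e")]
same_place = retrieve_by_location(buffer, (1, 0))
wider_search = retrieve_by_strength(buffer, position=2, strength=6, average_strength=5, base_range=0)
```

Here the first call retrieves inputs by coordinates alone, while the second widens the window symmetrically around the current position, which is why it applies equally to spatial and temporal shifts.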

The second reason is equally valid for both spatial & temporal shifts, which is why I’ve corrected my previous reply before you answered it: “buffering is *more* useful for spatial focus shifts”. Sorry to keep changing it on you :).

So, you’re right, buffering for the second reason would be more important in irreversible shifts. But I think potential proximity is a lot more important reason to buffer data, - space is multi-dimensional & prediction is far more affected by external impacts than by past trajectory of the pattern.

Posted by Boris Kazachenko, last edited Jul 2, 2010 1:59 AM
Sorry this comment came up a bit long (split in two), but I needed to include the explanations.

Edited: I shortened it, too long a comment, I'll post details in my own place.

T>> Sorry if you don't really care about this neuro stuff,
B>It's fun, but doesn't seem to be terribly relevant. I prefer to think in terms of function, biological analogues are too macro, functionally mixed, & not well understood. That goes for “engineering” discussion too :).

I agree functional is a cleaner way, but I'll share a bit more, it's "neuro-functional-behavioral-evolutionary" and it's related to the other thread.

I suspect that minicolumns and neocortex structure and functionality could be reproduced from simpler structures by a series of simple transformations, such as replication, extension of range of connections, variation etc. To some extent, neocortical functions were already implied in fish, amphibian and reptile brains.

(...)

B> You're thinking in terms of costs, not benefits. Experience has no intrinsic value, the purpose is to predict

It hasn't, but the more complex/higher-resolution the environment, and the faster it changes than "full" evaluation in real time can follow, the more important exact records might be for delayed evaluation, because it becomes impossible to decide on the spot whether the information will be predictive in the future.

...Continues...

Posted by Todor Arnaudov, last edited Jul 10, 2010 10:00 AM
...Continues from the previous one...

Edit: Shortened, more detailed explanations would be posted in my blog or so.

>& you don’t predict the past.

Depends on what you mean by the past, sometimes you *do predict* the past: recalling the older past and searching for the reasons of how it got to the more recent past, because you missed some details or didn't understand them then.

(...)

B> But I think potential proximity is a lot more important reason to buffer data, - space is multi-dimensional & prediction is far more affected by external impacts than by past trajectory of the pattern.

Maybe, but I suspect this might be overgeneralized. I guess a good mind should be able to adapt, depending on experience and available resources. If the mind has to decide, it should try to predict what might be more important and what would be useful to buffer.

Some comments, pardon my typos

Boris,

I am no expert but I am a Ph.D. student in Cognitive Psychology with a specialization in Neuroscience. I am a little lost reading your post, in part due to a mismatch between your terminology and that of neuroscience and cognitive neuroscience. I think your ideas as I understand them are interesting but I would like to make a few corrections and comments.

In response to, "A functional unit of neocortex is a minicolumn, which seems to perform recognition / generalization function." Microcolumns (or minicolumns) perform neither "recognition" nor "generalization" but are involved in sensory processing such as vision, audition, smell etc. Your ability to recognize something as an object or your ability to make conceptual generalizations are high-level cortical functions. Microcolumns are organized anatomical structures that process particular features. For example, in V1 or primary visual cortex, specific neurons termed simple cells respond preferentially to specific features such as line orientation. These cells distinguish between different line orientations (such as / | \ ) by changing neural firing rate. These cells will respond strongly to a preferred orientation but may fire to a lesser extent to other orientations: the greater the difference in orientation between the stimulus and the preferred orientation, the less the cell will fire.
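The graded response described above can be sketched as a tuning curve. The Gaussian shape is a common textbook simplification and is an assumption here, as are the function names and the `max_rate` and `width_deg` values; none of them come from this comment.

```python
import math


def orientation_difference(a_deg, b_deg):
    """Smallest angle between two line orientations, in degrees.
    Orientations repeat every 180 degrees (a line at 0 equals one at 180)."""
    d = abs(a_deg - b_deg) % 180.0
    return min(d, 180.0 - d)


def firing_rate(stimulus_deg, preferred_deg, max_rate=50.0, width_deg=20.0):
    """Hypothetical simple-cell response in spikes per second.

    The rate peaks at the preferred orientation and falls off smoothly
    as the stimulus orientation moves away from it (assumed Gaussian).
    """
    d = orientation_difference(stimulus_deg, preferred_deg)
    return max_rate * math.exp(-(d * d) / (2.0 * width_deg ** 2))


# A cell preferring vertical lines (90 degrees), probed with | / and a flatter line.
rates = [firing_rate(s, preferred_deg=90.0) for s in (90.0, 70.0, 45.0)]
```

The list `rates` decreases monotonically, matching the statement that the cell fires less the further the stimulus is from its preferred orientation.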

Recognition, a memory function, and generalization, the ability to transfer learning to a novel or related situation, are distinct abilities and brain processes.

I would caution you against generalizing, no pun intended, from work on autism or other patient populations to make claims about individual differences in normal human cognition. Non-autistic and autistic individuals can both do feature processing, can perceive objects by integrating features and can remember those objects. Some but not all autistic individuals have difficulty with conceptual information. Autistic individuals are feature-focused, though they do have some interesting high-level perceptual deficits with objects and faces. I recommend the following paper:

Gastgeb, H.Z., Strauss, M.S. & Minshew, N.J. (2006). Do individuals with autism process categories differently? The effect of typicality and development. Child Development, 77(6), 1717-1729.

The concept of IQ and intelligence is an extremely dicey subject. I recommend the following “Tall Tales about the Mind and Brain: Separating Fact from Fiction” by Sergio Della Sala – the chapters are written by highly regarded (cognitive) neuroscientists.

In closing I want to say that cognitive neuroscience is a long ways off from addressing the type of questions that interest you—we simply aren’t there yet—(see Bruer’s paper Education and the Brain: A Bridge Too Far in the journal Educational Researcher, v26 n8 p4-16 Nov 1997). There is some information but the field of Cognitive Psychology can more thoroughly address your ideas. But I do recommend the following papers.

Morrison, Krawczyk, Holyoak, Hummel, Chow, Miller, Knowlton (2004). A Neurocomputational Model of Analogical Reasoning and its Breakdown in Frontotemporal Lobar Degeneration. Journal of Cognitive Neuroscience, 16(2), 260-271.

Waltz, Knowlton, Holyoak, Boone, Mishkin, de Menezes Santos, Thomas, Miller (1999). A system for relational reasoning in human prefrontal cortex. Psychological Science, 10(2), 119-125.

And finally some shameless self-promotion, my chapter on the brain and expertise may be of interest to you. The entire book may be of interest to you, my chapter is the only one that involves neuroscience, the rest deals with the expertise as studied by cognitive psychology.

Hill, N.M. & Schneider, W. (2006). Brain changes in the Development of Expertise: Neuroanatomical and Neurophysiological Evidence about Skill-based Adaptations. In K. A. Ericsson, N. Charness, P. Feltovich, and R. Hoffman (Eds.), Cambridge Handbook of Expertise and Expert Performance. New York: Cambridge University Press.





Nicole M Hill


Last edited Jul 3, 2010 10:21 PM
Thanks for the comments & references Nicole!

The mismatch in terminology is indeed formidable, & reflects corresponding mismatch in our conceptual frameworks.
First of all, recognition/generalization are distinct high-level abilities only if you define them as such for some high-level cognitive test. Unless so specified, recognition/generalization is simply a discovery of common elements among multiple inputs. Algorithmically, it's an iterative comparison (which discovers match) & projection (that determines which inputs are compared): a step producing incrementally more general patterns/concepts. This step is not specific to any level of complexity. Generalization starts from sensory processing, such as line angle recognition in your V1 example, & continues into Association Cortices. To say that minicolumns do not perform generalization is a bit absurd. Neocortex consists of little but minicolumns (see "Cortex & Mind", p.26) and every cognitive function can be reduced to generalization.
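A minimal sketch of the comparison and projection step described above, under loud assumptions: inputs are (position, feature set) pairs, comparison is set intersection, and projection is a fixed `reach` around the last matched position. The function name `generalize` and all of these choices are hypothetical illustrations, not the algorithm from the "Intelligence" knol.

```python
def generalize(inputs, reach=1):
    """Iteratively reduce inputs to the elements they have in common.

    inputs: list of (position, features) pairs, features being a set.
    reach:  assumed projection rule, only inputs within this distance
            of the last matched position are compared.
    """
    position, features = inputs[0]
    pattern = set(features)
    for next_position, next_features in inputs[1:]:
        # Projection: decide whether this input gets compared at all.
        if abs(next_position - position) > reach:
            continue
        # Comparison: the match is what the pattern and input share.
        match = pattern & set(next_features)
        if match:
            # The pattern becomes incrementally more general.
            pattern, position = match, next_position
    return pattern


# Hypothetical inputs: the one at position 5 is out of reach and never compared.
inputs = [
    (0, {"edge", "red", "round"}),
    (1, {"edge", "red"}),
    (5, {"blue"}),
    (2, {"edge", "green"}),
]
common = generalize(inputs)
```

Each pass keeps only the shared elements, so the resulting pattern covers more inputs with fewer details, and nothing in the loop depends on what the features actually represent.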

Thanks for the pointer to Gazzaniga's article, I will mention it in the knol. "The evolutionary perspective" chapter there indirectly supports my premise: hemispherical asymmetry can be summarized as a relatively higher-generality bias of the left hemisphere. This seems to be a distinctly human feature, producing hugely greater overall generalization ability compared to our nearest relatives. The hemispheres do not normally operate independently, they are densely interconnected by CC. Some of this connectivity is to provide simple fault-tolerance & sensory-motor field integration, as in animals. But because of the asymmetry ("lateralization") in humans, the transfer of data between hemispheres will likely be between different levels of generality. This mismatch will add another step of generalization to the hierarchy of the left hemisphere.

I couldn't find your chapter online(?), but you seem to work with MRI, which is too high a level for me. I think the most interesting part is processing within a minicolumn, at the most a macrocolumn.
Cognitive Psychology is also too high-level for me, I am into the most basic mechanisms of cognition. Neuroscience can be quite suggestive, given a meaningful theory. My ideas here are difficult to understand out of the context of my "Intelligence" knol: http://knol.google.com/k/boris-kazachenko/intelligence/27zxw65mxxlt7/2# though it's a lot more abstract.

Appreciate your interest and the references, though it may take me a while to get to them, as this is not my main focus.

Boris.