Cognitive Focus

11/20/21

Cortical trade-offs in generalist vs. specialist bias

Neocortex is loosely organized as a hierarchy of generalization, in which stimuli selectively propagate from primary to association areas ("Cortex & Mind", Cortical Memory by Joaquin Fuster). Higher areas represent increasingly general patterns or concepts: spatial and temporal receptive field per neuron and cortical column expands with their elevation in cortical hierarchy.

Generalization is basically a pattern discovery process. In neural implementation, a pattern is a coincidence of multiple inputs, or presynaptic spikes. A sufficient number of coincident spikes triggers Hebbian learning: “fire together, wire together” between simultaneously spiking neurons. More precisely, a synapse is strengthened if the pre-synaptic neuron fires just before the post-synaptic one.
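The timing rule in the last sentence can be sketched as a toy spike-timing-dependent update. This is only an illustration, not a model taken from the cited sources: the function names, the exponential window, the amplitudes, and the 20 ms time constant are all assumptions, and the weakening branch for the reverse spike order is the standard textbook complement, which the paragraph above does not state.

```python
import math

def stdp_delta(t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change for one pre/post spike pair (times in ms).

    All parameter values are illustrative assumptions.
    """
    dt = t_post - t_pre
    if dt > 0:
        # Pre-synaptic spike came first: strengthen, more for closer timing.
        return a_plus * math.exp(-dt / tau)
    if dt < 0:
        # Post-synaptic spike came first: weaken (assumed complement rule).
        return -a_minus * math.exp(dt / tau)
    return 0.0

def update_weight(w, t_pre, t_post, w_min=0.0, w_max=1.0):
    """Apply one pairwise update and keep the weight within bounds."""
    return min(w_max, max(w_min, w + stdp_delta(t_pre, t_post)))

# Hypothetical usage: the same synapse under the two spike orders.
w = 0.5
print(update_weight(w, t_pre=10.0, t_post=15.0))  # pre before post
print(update_weight(w, t_pre=15.0, t_post=10.0))  # post before pre
```

Closer timing gives a larger change, which is one way to read "stronger patterns" as a tighter coincidence of spikes.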

Most of neocortex is connections between neurons (dendrites and axons), plus their life support. Given limited resources within a skull, there must be a tradeoff between total number of connections and their average length. In other words, a network can be relatively dense, with more connections of shorter average length, or sparse, with fewer total connections of greater average length.

The choice of coincident inputs becomes exponentially greater with the length of connections. Hence, stronger patterns (greater number and closer timing of coincident input spikes) can be discovered. But that range must come at the cost of having fewer total connections, thus less detailed memory. Which requires greater selectivity in learning: longer reinforcement to form and strengthen synapses.
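A back-of-the-envelope sketch of this trade-off, under loudly simplified assumptions: total wiring is a fixed budget equal to the number of connections times their average length, neurons are spread uniformly, and a connection of a given length can reach anything within a sphere of that radius. The function name, the units, and the budget value are hypothetical. The sketch only counts candidate inputs within reach, not the possible combinations of them, which grow much faster.

```python
import math

def network_profile(wiring_budget, avg_length, neuron_density=1.0):
    """Return (number of connections, candidate inputs within reach).

    Assumes wiring_budget = n_connections * avg_length, and a uniform
    neuron density in arbitrary units. Both are toy assumptions.
    """
    n_connections = wiring_budget / avg_length
    reachable = neuron_density * (4.0 / 3.0) * math.pi * avg_length ** 3
    return n_connections, reachable

# Same budget, two connection lengths: "dense" vs. "sparse" network.
dense = network_profile(wiring_budget=1000.0, avg_length=1.0)
sparse = network_profile(wiring_budget=1000.0, avg_length=2.0)
print("dense: ", dense)
print("sparse:", sparse)
```

In this toy setting, doubling the average length halves the number of connections while multiplying the pool of candidate inputs by eight.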

So, other things being equal, there must be a tradeoff between speed and detail of learning, and scope and stability of learned patterns. Relatively dense hierarchy prioritizes speed and detail, I call it a “specialist bias”, while sparse hierarchy selects for scope and persistence: my “generalist bias”.

Cellular factors in density vs. range tradeoff.

Initial determinant of cortical density is the rate of division and survival for neuronal progenitor cells during cortical development. Slower division or faster die-off leaves fewer progenitor cells, which will form fewer cortical neurons. That should leave more space and resources to grow longer connections among them. This is likely determined by nerve growth factors and receptors: higher activity should form a denser network. One such factor may be the CNTNAP2 gene, expressed mostly in prefrontal and parietal cortices, and probably correlated with autism: Genes for autism or genes for connectivity.

The most recognizable feature of neocortex is its six layers. Deeper and older layers VI and V mostly mediate cortico-subcortical integration, layer IV propagates data flow up the cortical hierarchy via thalamus, and newer layers II and III provide intra-cortical connectivity, mostly via layer I axons.

Henry Markram reported innate “peak connectivity” of layer V pyramidal cells at 300-500 µm. There are ~50 cell clusters (representation units) interlaced within that distance. It seems to me that these clusters provide minimal representation redundancy and mutual support via reverberating firing. Each cluster probably responds to some specific intensity of stimulus. These clusters inhibit each other within a column: “Sparse distributed coding model…” to adjust for redundancy within receptive field. So, variation in the range of such peak connectivity may be one of dense vs. sparse factors.

A unique feature in human brain (and to a lesser extent in other primates and whales) is spindle cells. Wikipedia: "Spindle cells emerge postnatally and eventually become widely connected with diverse parts of the brain, evidencing their essential contributions to the superior capacity of hominids to focus on difficult problems." Axons of spindle cells are less branched than those of pyramidal neurons, and their extended range must come at the expense of reduced density of other connections. This trade-off probably enables better top-down (general-to-specific) focus in humans.

Another possible factor in the trade-off is the ratio of glia to neurons, which also determines sparsity. This is an excerpt from "The Root of Thought": "As we move up the evolutionary ladder, in a widely researched worm, Caenorhabditis elegans, glia are 16 percent of the nervous system. The fruit fly’s brain has about 20 percent glia. In rodents such as mice and rats, glia make up 60 percent of the nervous system. The nervous system of the chimpanzee has 80 percent glia, with the human at 90 percent. The ratio of glia to neurons increases with our definition of intelligence."

However, his interpretation that glia are the main information-processing component in the human brain is implausible; I agree with the mainstream opinion that they mainly provide support for neurons.
Greater proportion of glia would reduce density of neurons, but enable higher activity and longer-range connections for each of them. Again, that means a sparser neural network.

Differences between higher and lower cortical regions

Very roughly, cortical hierarchy consists of four sub-hierarchies, listed from the bottom up:

- spectrum of primary-to-association cortices, within each of sensory and motor cortices
- posterior sensory and anterior motor cortices, the latter is somewhat higher in generalization
- lateral task-positive and medial default-mode networks, the latter is somewhat higher
- right and left hemispheres, the latter is somewhat higher

Joaquin Fuster on the differences between primary and association areas in Cortex & Mind, p. 73:
“At the lower level, representation is highly concrete and localized, and thus highly vulnerable. Local damage leads to well-delimited sensory deficit. In unimodal association cortex, representation is more categorical and more distributed, in networks that span relatively large sections of that cortex… In transmodal areas representation is even more widely distributed…” P. 82: “Thus a higher-level cognit (e.g., an abstract concept) would be represented in a wide network of association cortex…”
In my terms, wider networks imply “sparse bias” on higher levels of generalization.

Similar quotation via “How to Create a Mind” by Ray Kurzweil, p. 86: A study of 25 visual and multi-modal cortical areas by Daniel Felleman found that “As they went up the neocortical hierarchy, ... processing of patterns comprised larger spatial areas and involved longer time periods”.
Another study by Uri Hasson stated: “It is well established that neurons along the visual cortical pathways have increasingly larger spatial receptive field.” and found that “similar to cortical hierarchy of spatial receptive fields, there is a hierarchy of progressively longer temporal receptive windows”.

The neocortex is myelinated sequentially from primary to association areas at correspondingly increasing age (up to ~30 years old for prefrontal cortex), and myelination then seems to decline in reversed order ("Human Neurophysiology", page 197). Allowing for a multi-year delay in knowledge accumulation, this probably reflects and / or determines the age at which abilities peak in fields that require knowledge of corresponding generality. It's known that athletic abilities (primary cortices) peak in early 20s, and mathematical skills (likely parietal cortex) a bit later.

On the other hand, performance in business, politics, social sciences, and literature (prefrontal cortex?) doesn't peak until late in life. This is probably even more true in philosophy, but performance metrics there are questionable. Also supportive is the observation that cortical development sequence is delayed by several years in subjects with ADHD. Obviously, effective generality of discovered concepts, thus also development of higher association areas, depends on attention span.

Comparison between parietal and prefrontal cortex (highest levels of sensory and motor cortices):

Buschman & Miller of MIT (ref) “have found two types of attention in two separate regions of the brain. The prefrontal cortex is in charge of willful concentration; if you are studying for a test or writing a novel, the impetus and the orders come from there. But if there is a sudden, riveting event—the attack of a tiger or the scream of a child—it is the parietal cortex that is activated. The MIT scientists have learned that the two brain regions sustain concentration when the neurons emit pulses of electricity at specific rates—faster frequencies for the automatic processing of the parietal cortex, slower frequencies for the deliberate, intentional work of the prefrontal.”
I think lower frequency here is due to longer feedback loop of higher levels.

I don’t have much info on task-positive vs. default-mode networks, the latter is not well understood. Next is cortical hemispheric asymmetry, AKA lateralization:

Left hemisphere represents higher-generality and long-term-goal-associated concepts, while the right one mostly searches in the background, for lower-level contextual patterns (Cortex & Mind, p. 184, Split Brain, Gazzaniga). According to my premise, left hemisphere should be relatively “sparse”, which is supported in “Cortex & Mind”, p. 185: “Pyramidal cells in language areas have been found to be larger on the left than on the right (Hayes & Lewis, 1995; Hutsler & Gazzaniga, 1997)”. Their dendritic trees also extend further than those of right-hemisphere pyramids (Jacobs & Scheibel, 1993).

Another study: "Hemispheric asymmetries in cerebral cortical networks" found that columns in left hemisphere contain fewer minicolumns and better myelinated axons than corresponding areas of right hemisphere, with same volume and number of synapses.
Hemispheres are densely interconnected by the corpus callosum. This is partly for sensory-motor field integration and duplication (fault-tolerance). But greater “lateralization” in humans, vs. other primates, suggests that our hemispheres also combine in a deeper hierarchy of generalization. A Finnish study found that ambidexterity (correlated with lesser lateralization) doubles the risk of ADHD and lower academic performance in children.

Autism as a dense-connectivity cognitive style.

The best evidence for individual differences in cognitive focus comes from research on autism, which is known to increase attention to details, often at the expense of higher generalization. So, it’s a good proxy for a "specialist phenotype", which according to my thesis should display greater short-range vs. long-range connectivity. Below, I summarize some evidence for such bias in autism.

Casanova in "Abnormalities Of Cortical Circuitry In The Brains Of Autistic Individuals" reports that autistic individuals have more numerous but smaller and more densely packed minicolumns, each containing smaller than normal neurons with shorter axons (I came across it via A Shade of Gray: excellent review of relevant research, highly recommend).
Related study "Comparison of the Minicolumnar Morphometry of Three Distinguished Neuroscientists and Controls" by Casanova is reported in "Minicolumns, Genius, and Autism". The connectivity pattern of the neuroscientists appears to be similar to autistics in the density and size of minicolumns, but different in better inhibitory isolation between adjacent minicolumns. This should compensate for smaller size, while enabling greater number of minicolumns.

Similar finding of more compact and better insulated minicolumns in primates and cetacea, compared to cats and lower mammals, was reported in A comparative perspective on minicolumns and inhibitory GABAergic interneurons in the neocortex.
The thesis of local vs. global connectivity bias in autism is also supported in Exploring the Folds of the Brain--And Their Links to Autism by Hilgetag and Barbas: "in autistic people, communication between nearby cortical areas increases, whereas communication between distant areas decreases".
Such cortex should be more reliant on cortico-thalamo-cortical vs. cortico-cortical connections, which might be the implication in Partially enhanced thalamocortical functional connectivity in autism.

Henry Markram, a leading neuroscientist, father of an autistic son, and “pretty much an autist myself”, proposed The intense world theory - a unifying theory of the neurobiology of autism:

“The proposed neuropathology is hyper-functioning of local neural microcircuits, best characterized by hyper-reactivity and hyper-plasticity. Such hyper-functional microcircuits are speculated to become autonomous and memory trapped leading to the core cognitive consequences of hyper-perception, hyper-attention, hyper-memory and hyper-emotionality. The theory is centered on the neocortex and the amygdala, but could potentially be applied to all brain regions. The severity on each axis depends on the severity of the molecular syndrome expressed in different brain regions, which could uniquely shape the repertoire of symptoms of an autistic child. The progression of the disorder is proposed to be driven by overly strong reactions to experiences that drive the brain to a hyper-preference and overly selective state, which becomes more extreme with each new experience and may be particularly accelerated by emotionally charged experiences and trauma. This may lead to obsessively detailed information processing of fragments of the world and an involuntarily and systematic decoupling of the autist from what becomes a painfully intense world. The autistic is proposed to become trapped in a limited, but highly secure internal world with minimal extremes and surprises.”

"In the early phase of the child's life, repetition is a response to extreme fear. The autist perceives, feels and fears too much. Let them have their routines, no computers, television, no sharp colors, no surprises. It's the opposite of what parents are told to do. We actually think if you could develop a filtered environment in the early phase of life you could end up with an incredible genius child without many of the sensory challenges."

"The main critical periods for the brain during which time circuits form irreversibly are in the first few years (till about the age of 5 or so). We think this is an important age period when autism can either fully express to become a severe handicap or turned to become a major advantage. We think a calm filtered environment will not send the circuits into hyper-active modes, but the brain will keep most of its potential for plasticity. At later ages, filtered environments should help calm the autistic child and give them a starting point from where they can venture out. Each autistic child probably will first need its own bubble environment before one can start mixing bubbles. It should happen mostly on its own, but with very gentle guidance and encouragement. Do all you would want for your child ... but in slow motion ... let the child set the pace ... they need that control to feel secure enough to begin to venture off into any other bubbles."

A recent study found one reason for such intense perception: reduced synaptic spine pruning in autistic brains, secondary to mTOR over-expression. This seems to happen at the age of 3-4 years, when synaptic spine density normally decreases by ~50%. I suspect reduced pruning may also happen during prenatal development. If so, greater synaptic density should increase activity, thereby reducing normal prenatal die-off of neurons. This would explain minicolumnar differences noted by Casanova.

Either way, more numerous synaptic spines increase density of connections in the cortex, which must ultimately come at the expense of their effective range.
Truly pathological autism probably requires more than increased synaptic density and activity. The ultimate cause might be something as basic as pre- or post-natal viral infection or retroviral expression, combined with low vitamin D levels (as is likely the case for schizophrenia and bipolar disorder).

Sparse connectivity as a risk factor for schizophrenia.

One such factor is a greater ratio of astrocytes to neurons in schizophrenia, specifically in prefrontal cortex (astrocytes are a type of glial cell, covered above). This imbalance was recently discovered in a RIKEN study on stem cells and post-mortem brains of patients vs. controls. It seems to be due to reduced expression of gene DGCR8. Fewer neurons make the network more vulnerable to acute damage, but more astrocytes can better maintain remaining neurons for regular wear and tear of our long life.

A Duke University study found a more direct “sparse” risk factor for schizophrenia, increased synaptic pruning: “Spine pruning theory is supported by the observation that the frontal brain regions of people with schizophrenia have fewer dendritic spines, the tentacles on the receiving ends of neurons that process signals from other cells”. But this increased pruning happens during puberty, probably secondary to increased testosterone, vs. reduced pruning at 3-4 years old in autism.

Schizophrenia seems to be a uniquely human disorder, and there must be a reason these risk factors evolved. Other things that are unique for humans are large neocortex, complex society, and long life.
I think these risks and benefits are closely related: decreased density leaves more space and resources (such as astrocytes) for remaining neurons and connections, so they may grow longer. Which enables global intellectual integrity, thus dynamic social coordination, and long-term planning in general.

A more specific “sparse disorder” may be dyslexia. This connection was also made by Manuel Casanova: “Autism and dyslexia: A spectrum of cognitive styles as defined by minicolumnar morphometry”, although there is a lot less research on that. Basically, he thinks that dyslexia is caused or exacerbated by a “lossy” cognitive style, secondary to sparse connectivity, at least in language-oriented cortices.

Implications and speculations

Generalist vs. specialist tradeoffs are somewhat ambiguous in terms of modern societal utility:
- On one hand, speed & precision was more important for survival in the wild, which may explain why apes seem to have photographic memory, superior to humans: Chimps beat humans in memory test.
- On the other hand, more recent functional differentiation of modern society once again requires increasingly “lossless” knowledge acquisition. Social positions that do require higher generalization are relatively few, including law, management, politics and related academic disciplines.

In terms of gender, men are obviously overrepresented among extreme specialists, and even more so among generalists. This should be expected: extremes, especially that of environmental detachment, are risky, and risk is a male domain. Males don’t contribute nearly as much to reproduction as females, in some species only their genes. Their additional purpose in evolution is to serve as a test vehicle for variations, initially genetic and later also memetic. Males are a far better target for sexual selection because their mutations are more likely to be expressed: they have only one X chromosome.

Relatively speaking, women don’t take chances. They have two X chromosomes to conceal mutations, more symmetrical brains as a backup for damage, stronger immune system and higher HDL. Same for behavior: they have lower testosterone and vasopressin to avoid risk, higher estrogen and oxytocin to seek and provide support, generally heightened senses to pay more attention to their bodies and immediate environment. Another salient difference is recently discovered higher myelination in female thalamus, likely related to faster and more frequent attention switching in women. All that must come at the expense of intellectual detachment and higher generalization.

Paradoxically, generalist bias may also be associated with smaller brain size, due to shorter global links. For example, it is known that low-generality savant abilities can be induced by inhibiting prefrontal cortex, presumably because top-down speculation competes with bottom-up perception. Inversely, shorter distances improve signal propagation across global networks (such as fronto-parietal, fronto-temporal, salience networks), suppressing bottom-up detail by top-down filtering. And selection for stronger patterns is necessary to compensate for reduced overall memory capacity of a smaller brain.

Another benefit of smaller size is potentially better quality of development over the same time: there is a well-known “slow growth vs. sloppy growth” trade-off in biology. Basically, slower growth allows for more time and resources to prevent and correct mistakes made during cellular division and other anabolic processes. For example, slower-growing axons are less affected by short-term fluctuation in gradients guiding their growth cones. So, they will be straighter, further reducing the length of connections among cortical areas. And shorter connections are associated with higher IQ.

Of course, this is contrary to the conventional bigger-is-better view, supported by increasing brain size in human evolution. But this trend reversed after the Neolithic revolution, which might not be a bad thing. Some margin of increased brain and body size is net-beneficial only for fight / flight emergencies. Given drastically improved security of settled society, that margin should become net-detrimental, by impairing cellular quality. For example, although animals of larger species generally live longer, smaller individuals of the same species live longer than the larger ones in the absence of predation.

This is also true for women. But, women have proportionally less white matter and more grey matter than men, which compensates for shorter distances. And I think subcortical differences are even more important: lower testosterone and higher oxytocin make women more sensitive to their immediate environment, especially the social one. They’re better at bottom-up perception, but less free in top-down selection. Women do seem to have better integrity, but within a narrower range of interests.

On another note, IQ tests are inherently incapable of capturing higher generalization ability because they are time-limited. The tests are supposed to be background-neutral, except for verbal and math IQ. Thus, they can only measure our ability to discover patterns within data given to a subject during a relatively brief test. That means they’re biased toward the speed of learning, where “sparse” subjects are at a disadvantage. This is effectively confirmed by the finding that lobotomy, which disables prefrontal cortex (the seat of highest generalization levels), has little or no impact on IQ.

The same bias is built into any educational system: detail-oriented "dense" bias is better for passive knowledge acquisition. "Sparse" bias is better at independent research, but that’s far more difficult to evaluate. And modern science amassed a huge body of knowledge, which must be acquired before one can make a novel contribution. That's a major disadvantage for a generalist. Einstein’s assertion that “imagination is more important than knowledge” may no longer hold in established fields (not mine).

There's been a lot of talk about association between "genius" and autism, which I think is misleading for two reasons. First, the diagnosis of autism includes asocial behavior, which is irrelevant: anyone with unusual interests will be correspondingly "asocial". Closely related is avoidance of novelty, which is emotionally overwhelming for an autist. But a detached generalist would also avoid society and novelty, for the opposite reason: they are likely to be a trivial distraction to his own thoughts.

Second, it is far easier to recognize exceptional abilities of a specialist than those of a generalist. We all share lower generality levels, - that's where the data comes from, leaving less to interpretation. But effective generality of top association cortices definitely differs among individuals, and it takes an equally competent generalist to evaluate quality of generalizations. Which is why the work in psycho-social sciences, and especially in philosophy, is so vastly inferior to that in relatively lossless hard sciences. So, an autistic genius is far more likely to gain recognition than “anti-autistic” one.

Needless to say, this write-up is motivated by introspection.

Introspection

I design cognitive process from a functional definition, as open-ended hierarchical pattern discovery: www.cognitivealgorithm.info. Been working on that most of my life, anything else is trivial by comparison. On my own, because nothing I’ve come across is coherent enough. And because I can, emotionally, recently financially, and even more recently with decent concentration.

Real intellectual integrity seems to be extremely abnormal in people, probably a developmental artifact for me. Lacking nearly universal addiction to social support and tangible benefits (starting with experimental confirmation), I am free to focus on purely cognitive imperatives.

My first interests were geography and history, then physical sciences and biology. I majored in social science because modern society has the deepest structured complexity of any established subject. But that field lacks in academic integrity. And the most important process in society is discovery and invention, which is a composite of individual human learning. So, I switched to studying the latter, most of my life ago, both for intellectual depth and for potential impact on the world.

That doesn’t mean psychology and neuroscience. I got into both more recently, mostly for insight into our deficiencies. To understand intrinsic function of cognition, vs. tons of other things in human mind, sustained introspective generalization is far superior to mere observation. Having started with the former, I find everything about brain and neurons to be grossly sub-optimal. Which is not surprising for a product of blind evolution and severe biological constraints.

Formalizing cognition is also the only legitimate problem in philosophy, which was my interest for a while. But establishment philosophers are too busy bullshitting college freshmen and other clueless highbrows, they seem to have no time or interest for real work.

Then there is math, but that’s primarily deductive, while cognition is primarily inductive. Effective induction requires fine-grain selection: logically complex but mathematically simple. People like math for its certainty, but that’s mutually exclusive with complexity of a subject. Which doesn’t get more complex than effective intelligence. I picked complexity and speculation first, certainty had to wait.

Logical complexity is the province of coding, but that requires a constructively defined objective first. It was never defined for cognition, so it took me a lot of work before I could start programming. And it didn’t help that I first tried C, which is horrible for anything conceptually complex. So I went off to work with my own pseudocode, until I realized that there is already Python for that.

I skip on biography because mine is a life of mind, the rest is a distraction (I had plenty of that). Throughout history, working alone on my problem would be of no consequence. Things changed: publish on the net and Google will find you with the right keywords, status and credentials be damned. And convincing people is not even necessary anymore, all you really need is working code.

Still, a constructive conversation would be nice for now, seeing that I am short of the former.

Anything I write is meant to be substantially original, thus speculative. But the subject is king, I never stop questioning assumptions and all my posts are a work in progress.

11/4/18

Cultivating top-down focus

Theoretical work is driven by sustained top-down attention. The top is long-term priorities, derived from broad generalizations, and the bottom is current experience. Evolution always neglected long-term: people didn’t survive very long unless they paid close attention to their immediate environment. Modern society is drastically more secure, but our attention spans barely budged. In fact, it’s been getting worse for the majority lately, - they just elected ADHD-addled clown-in-chief.

“Thinking is to people as swimming is to cats: they can do it but prefer not to” Daniel Kahneman.

“I have no special talent, I am only passionately curious” Albert Einstein.

A lot of people could become world-changing geniuses, if they spent 10 years of their youth fully focused on an important problem. But that must come at the cost of “life”: unthinkable for hand-to-mouth hunter-gatherers that we evolved to be. I first decided on my top priority in adolescence. But maintaining effective working focus on these abstractions, vs. “real” distractions, was far more difficult. Over the years, I majorly improved my concentration via the following techniques:

Practice, externalization, formalization

Anything profound is initially boring, curiosity is cultivated by incrementally deep study. Which forms redundant representations, differentiated by their context to explore alternative scenarios. This redundancy helps to maintain parallel subconsciously searching threads, even when your consciousness is distracted. They also fill up memory and starve unrelated subjects out of resources. This is very important: irrelevant memories keep competing for our attention until they fade out.

Obsessed with externalities, we need a conducive environment to facilitate a virtuous cycle of practice. Basic working environment is a notepad or a computer screen, so we need to fill them with a well-designed write-up of the subject. The brain obviously has plenty of memory for a few pages of text, the scarce resource here is our attention. Writing down thoughts turns them into a sensory feedback, which is far more effective at maintaining conscious attention than “internal” abstractions. Same for motor feedback: verbalizing, writing by hand, semi-random editing or re-arranging text or code.

Another focus aid is formalization: developing subject-specific terminology, abbreviations, symbols. This is critical for building concise and comprehensive model of a subject, structured to reverberate within working memory. Such model must be incrementally refined and extended, nothing worthwhile can be done on the first try. Refining means resolving internal contradictions and eliminating overlaps or the irrelevant, vs. simply accumulating related aspects and perspectives.

Stimulation and avoiding distractions

One of the most important “environment and stimulants” is people we deal with. Your listener's attention (if credible) stimulates yours, even when he doesn't contribute anything. To facilitate this, universities and companies impose face-to-face contact among colleagues. However, relevance of these institutions themselves depends on societal consumer competence, which is sorely lacking on higher-generality subjects. But social stimulation can be replaced by writing or talking to oneself.

Beside relevant stimulation (be honest about “relevant”), one must block the irrelevant one. Real-life socializing is almost always meaningless, compared to impersonal reading and writing. People are desperate to join a group and rejection feels like a death sentence. But if there is no sufficiently relevant group, any socializing is a huge waste of mindspace. However miserable social isolation feels at first, you get used to it. Aside from broad stimulation and clear purpose, attention is a zero-sum game.

Such broad stimulation is easy: tea, cocoa, and low-dose nicotine (patch) do it for me. As distinct from smoking, nicotine itself is pretty benign, see Gwern. For the less intrinsically stimulated, there are ritalin, adderall, deprenyl, modafinil, etc. Another potent stimulant is exercise while working. I work on a treadmill desk and alternate between walking, standing, and sitting, all while remaining in front of a projector screen (which is more “immersive” and distant than a monitor: it doesn’t jump in the eyes as much when you walk). Highly recommend, it probably added ~3 hours of work per day.

Beside socializing, the worst attention hog now is the web. Dealing with it was a big challenge, until I discovered Cold Turkey. I block all but a work-related whitelist most of the day. Sounds trivial, but it made a huge difference to my concentration. You may even want to lock yourself in for a fixed time, just put the key in a kitchen safe.

Direct self-conditioning

But even more insidious, at least for a generalist like me, are internal distractions: wandering thoughts.

There is a low-tech solution: aversive conditioning. It can be simple and old-fashioned: just slap your face when distracted. But it’s a war with your own reptilian brain: slapping must become reflexive, you shouldn’t have to decide on it. Then irrelevant subjects acquire unpleasant associations and you will avoid them. Of course, that depends on mindfulness: monitoring your thoughts for distractions.

Positive conditioning of relevant thoughts is far more difficult: they are fluid, subconscious and don’t associate with specific cues for conventional reinforcement. Less specific but still helpful is reserving a specific desk, computer, and time only for work. Also useful is neurofeedback, article. I currently use a very simple version: every day, I write down the number of hours spent on work, multiplied by their effectiveness relative to average effectiveness of recent working hours. It does help a bit.
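The daily number described above can be sketched in a few lines of Python. This is only an illustration: the function name `daily_score`, the self-rated effectiveness scale, and the sample values are my assumptions, not part of the original log.

```python
from statistics import mean

def daily_score(hours, effectiveness, recent_effectiveness):
    """Hours worked today, weighted by today's effectiveness
    relative to the average effectiveness of recent working days.

    All names and the rating scale are hypothetical.
    """
    baseline = mean(recent_effectiveness)
    if baseline <= 0:
        raise ValueError("recent average effectiveness must be positive")
    return hours * effectiveness / baseline

# Hypothetical example: self-rated effectiveness on an arbitrary 1-10 scale.
recent = [5, 6, 7, 6]               # average is 6
print(daily_score(8, 9, recent))    # 8 * 9 / 6 = 12.0
```

A day that is exactly as effective as the recent average scores its plain hours. A better-than-average day scores more, so the number rewards quality of focus and not only time at the desk.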

Advanced neurofeedback may become possible by transcranial imaging to visualize cortical activity. Eventually, we will directly stimulate cortical areas that represent relevant subjects. Stimulation by red and infrared light is already feasible, but very imprecise. Overall, top-down attention seems to be coordinated by left dorsolateral prefrontal cortex: the highest level of task-specific generalization, while symbolic and mathematical processing is in left inferior parietal cortex / angular gyrus.

Deliberate control over the subject of attention will be the most profound revolution yet: it will change what we want out of life. But waiting for technology will leave you hopelessly behind those who do it the old-fashioned way. Of course, most of us dressed-up apes don’t care, - there are bananas to be picked.

7/7/12

Consciousness as an artifact of brain-to-body bottleneck

Conscious attention, implemented as working memory, is focused on a single or few items at a time. Intrinsically, cognition doesn't need such central focus: our brains are massively parallel. Ideally, the pool of neurons should be allocated to many subjects of interest by something like a market, according to predictive value of these subjects. This is probably what happens in unconscious or intuitive cognition.

But the brain evolved to guide a single body, which in most respects can only do one thing at a time. Hence the artifact of central consciousness. Beside disrupting smooth allocation of cognitive resources, this bottleneck obviously favors somatic concerns, which constitute lower forms of human motivation.

Such sequential focus is likely implemented by the thalamus, which serves as a central switchboard for the brain. It seems to invoke consciousness by generating higher-frequency brainwaves, especially gamma waves, which bind together areas related to working memory (brief overview: The Missing Moment by Robert Pollack, pp. 46-56, or a far more involved treatment: Rhythms of the Brain by Gyorgy Buzsaki).

My personal opinion is that the main function of the thalamus is to mediate competition between brain areas, particularly via TRN. From a networking perspective, it’s a lot cheaper to do this in a central body, as opposed to each region or column directly inhibiting all others. In fact, Sherman and Guillery suggest that the thalamus could be viewed as a consolidated “7th layer” of neocortex (“Exploring the Thalamus”).

Primary sensory and motor cortices seem to be overrepresented in thalamus, - pulvinar nuclei alone comprise 40% of it. Better thalamic connectivity of primary cortices should enhance search for relevant associations in other brain areas. This is introspectively plausible: most of working memory is what we currently visualize, vocalize, or actualize. I think we enhance our focus on general concepts in the same fashion: by generating fake experiences of subvocalizing, subvisualizing, and subactualizing.

This is probably mediated by feedback to primary cortices, underutilized during sensory “vacations”. So, primary cortices are frequently “hijacked” by higher areas to simulate (interactively project) their generalized concepts. Such “primarization” is particularly important for mathematicians, engineers, and scientists, who work on imaginary constructs and often think visually rather than verbally.

However, primary cortices are unnaturally “low” for such subjects. Because “elevation” is wrong, these projections often become false memories, confabulations, hallucinations, - substitution of imagination (feedback) for actual experience (feedforward). This may be a factor in developing schizophrenia, in which imagination seems to get out of control. It is suggestive that default mode network, and specifically left posterior cingulate cortex, were found to be unusually active in schizophrenics.

Such confusion should be more likely in habitually hijacked primary areas, which may become less attached to their respective senses. More general concepts are represented by higher association cortices. The highest area seems to be dorsolateral PFC: developmentally the last to myelinate and the most involved in executive function. Primarization could be mediated by short-cuts to lower levels of cortical hierarchy, such as arcuate fasciculus and spindle neurons, with their far-reaching axons.

Basal ganglia: subcortical modulator of attention.

While the thalamus seems to be a relatively neutral mediator of competition for conscious attention, the basal ganglia implement conditioning, which actively directs focus. Phasic dopamine in the basal ganglia also indicates “reward prediction error”, and variation in sensitivity to dopamine is a risk factor for ADHD.

For example, ADHD is correlated with the 7-repeat allele of DRD4, which accelerates reuptake. Even more important might be variation in the COMT gene: the Met 158 allele, which degrades postsynaptic dopamine 4x slower than the Val 158 allele, is associated with better working memory, but slower attention switching. Basically, it enhances top-down or goal-directed attention vs. bottom-up or novelty-oriented attention. ADHD is treated by norepinephrine and dopamine agonists or reuptake inhibitors, such as bupropion.

This differs between hemispheres: "To advance our understanding of ADHD and medication effects we draw upon the evidence for (1) a neurotransmitter imbalance between norepinephrine and dopamine in attention-deficit hyperactivity disorder and (2) an asymmetric neural control system that links the dopaminergic pathways to left hemispheric processing and links the noradrenergic pathways to right hemispheric processing. It appears that attention-deficit hyperactivity disorder may involve a bi-hemispheric dysfunction characterized by reduced dopaminergic and excessive noradrenergic functioning. In turn, favorable medication effects may be mediated by restoration in neurotransmitter balance and by increased control over the allocation of attentional resources between hemispheres".

On a cellular level, temporal attention span is inversely proportional to the "decay rate" for stimuli propagating from primary into association areas of neocortex. Passive decay is caused by charge dissipation across neuronal membrane and reuptake of excitatory neurotransmitters at the synapses. Such decay promotes relatively novel stimuli. On the other hand, active suppression by neurons that represent competing stimuli, via inhibitory interneurons and neurotransmitters, should promote relatively recurrent or concurrent stimuli. Longer term, slower passive decay would correspond to longer connections and competition among more distant and persistent stimuli.
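The inverse relation between attention span and decay rate can be made concrete under one simplifying assumption that is mine, not the post's: that passive decay is exponential, so a stimulus of initial strength A0 falls to A0 * exp(-rate * t) after time t. The "span" is then the time until the stimulus drops below some threshold. The function name, the threshold, and the units are all hypothetical.

```python
import math

def attention_span(decay_rate, initial=1.0, threshold=0.1):
    """Time for an exponentially decaying stimulus to fall below threshold.

    Assumes A(t) = initial * exp(-decay_rate * t), which is an
    illustrative model only. Solving A(t) = threshold for t gives
    t = ln(initial / threshold) / decay_rate.
    """
    if decay_rate <= 0:
        raise ValueError("decay_rate must be positive")
    if not 0 < threshold < initial:
        raise ValueError("threshold must lie between 0 and initial")
    return math.log(initial / threshold) / decay_rate

# Doubling the decay rate halves the span.
slow = attention_span(1.0)
fast = attention_span(2.0)
print(slow / fast)
```

In this sketch the span depends on the threshold only through a logarithm, so the decay rate dominates. Active suppression by competing stimuli is not modeled here.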

Other factors affecting stimulus decay rate are axonal straightness and myelination, structural trade-offs within cortical minicolumns and thalamus (see “Cortical Trade-Offs” post), and so on. A developmental possibility is that high levels of cortisol / low levels of serotonin increase the levels of phasic dopamine, which in turn accelerates dopamine reuptake. ADHD sufferers have fewer dopamine autoreceptors, leading to greater fluctuations in its levels and increased novelty seeking to keep the cortex busy.

The degree of preference for novelty in the immediate environment also depends on recent intensity of value-loaded stimuli, modulated by our subjective sensitivity to the latter. Sensitivity is increased by deprivation (vs addiction) for positive stimuli, and security (vs vulnerability) for the negative ones. Particularly during formative years, attention span can be increased by broad intellectual exposure, if combined with weak visceral pressures and temptations.

1/7/12

Comments from "Cognitive Focus" knol

Derek Zahn on my personal knol:

>"Given limited resources, there must be a trade-off between the number & the length of connections in such network..."
>I don't understand what you mean by "length" here... It seems that the topology etc of the network would be the important properties, not physical measurements...

Here I assume that innate topology of neocortex is roughly the same, - genetic variation among individuals is very minor. On the other hand, the “dense vs sparse” bias (genetic or perinatal) requires very little information, see “Developmental factors” section. Of course, adult topology is largely acquired, but acquisition process itself is affected by innate biases.

By “length” (of axons) I mean average distance between connected nodes (minicolumns?), in whatever topology. Given a fixed total length of connections (resources), greater average length of an individual one-to-one connection must come at the cost of a smaller total number of these connections. Think of spindle neurons, - very few but very long connections. So, this would produce a sparser network with longer-range & more selective associations (concepts). Selection itself is probably through some variation of Hebbian “fire together, wire together”.
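The arithmetic behind this trade-off is simple enough to sketch. With a fixed budget of total connection length, the number of connections is the budget divided by their average length. The function name and the unit-free numbers below are illustrative assumptions, not measurements.

```python
def connection_count(total_length, average_length):
    """Number of connections affordable under a fixed total-length budget.

    Both arguments are in the same arbitrary (hypothetical) length units.
    """
    if total_length <= 0 or average_length <= 0:
        raise ValueError("lengths must be positive")
    return total_length / average_length

budget = 1000.0                        # same wiring budget for both networks
dense = connection_count(budget, 1.0)  # short connections -> 1000.0 of them
sparse = connection_count(budget, 10.0)  # 10x longer -> 100.0 of them
print(dense, sparse)
```

So a tenfold increase in average range costs a tenfold reduction in the number of connections, whatever the topology.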

Let’s face it, the brain is physical, its resources are limited, there are trade-offs to be made. Again, this knol is on gross neural bias only, I deal with algorithmic level (not necessarily neuromorphic) on my “Intelligence” knol.

Todor Arnaudov:

Cognitive abilities peaks. Saturation of learning and Generalization novelty seeking.

Hi Boris,

I'm not ready to discuss on neurological stuff, but I could on the years of peak of cognitive abilities.

I think some of the abilities are "flat" or at least could be "emulated" without deep generalization, and their peak might be more likely dependent on social status, aim at power and focus rather than special generalization shift. Science has also a social status bug, because usually researchers are supposed and they do accept to serve their master's directions until 30s (PhD, post-doc ...)

Language and stories maybe don't have that deep hierarchy or so, I don't know, but I think gifted writers and poets may reach to high or "perfect" skills as early as their 20s or even teenage years. ("Perfect" means there's not much more where to go in style and how to tell a story interestingly.)

This can happen even without reading lots of sample fiction. Acknowledgement or time-span needed to write influential works may take decades, though.

Also, you call art "fluff", but I believe talent to write stories includes a good deal of generalization.
I think art is an imitation of algorithms; the worst authors copy data, the talented and original ones induce and understand algorithms (patterns) that could generate plausible data, their algorithms are more robust and are harder to reverse-engineer given only the artwork.

(I agree that for writing literature critics, reading lots of books through many years is helpful, though.)

Maybe I'm just an exception, but was authoring pretty high generality stuff at age of 17-18-19 such as philosophy (including "my theory"); [science] fiction and fantasy with philosophical elements; was doing language engineering (lexical and semantic enrichment of Bulgarian), "genre and style" engineering, and solid sociolinguistics research.

It was a peak, and I have explanation why it declined in the following years: saturation & distractors. :)

There's a phenomenon I call exhaustion or saturation of learning. Saturation is not only cognitive (e.g. you don't get social support for the activities and give up), but there is a crucial cognitive part which is related to boredom and the conditions where the cognitive algorithm should skip too predictive patterns. That's a form of novelty seeking, and I think it contributes to shift to higher generality concepts after lower ones are saturated.

When the mind extracts patterns from a given domain (set of raw data/patterns), initially it does so fast and improves quickly. This can be either at the same level of generalization (reaching high predictability & precision) or multi-level: increasingly abstract generalizations are discovered. However, the process slows down in both directions, eventually at the highest level of generalization discovered. The mind cannot find a higher level of generalization, gets bored and tends to switch to new domains in order to find:

- more unpredictable/complex patterns, starting from lowest level
- steeper function of generality increase (until another saturation)

I'd call this "Generalization novelty seeking"

I suspect persons who have higher tendency to search for inter-domain generality and the fast learners don't freeze in one single domain because they feel such saturation of generalization.

The general knowledge gotten is reused between domains and makes learning of new domains faster. Eventually domains run-out and merge, and this is accelerated by the inter-domain generalizations showing that different things are the same thing with different names.

After inter-domain saturation mind has no choice but to concentrate on higher concepts from the now merged domains, that seemed saturated before, and try to generalize further. Otherwise it would just be bored to death... :)

The not-that-inter-domain learners tend to focus on making one or a few narrow domains "perfect". They don't care or don't notice that most of the time the progress is very slow or none: they're reaching precision and generalization limits and doing the same thing over and over again with no improvement.

Last edited Jul 2, 2010 5:06 PM