The Making of Indian Statistics

How Mahalanobis Counted India's Aspirations.

In June 1933, a new statistics journal, bearing the name Sankhyā, was founded by Prasanta Chandra Mahalanobis [PCM] in Calcutta. The name was deliberate, as the editor’s note in the first issue explained: “As we interpret it, the fundamental aim of statistics is to give determinate and adequate knowledge of reality with the help of numbers and numerical analysis. The ancient Indian word Sankhyā embodies the same idea.” Sankhyā is Sanskrit for number – but also, by philosophical extension, for adequate knowledge. This was the program that Mahalanobis was to champion.

The first five volumes illustrated an ambitious and independent agenda comprising forty theoretical and eighty-seven applied papers. These articles were a formalization of the newly founded Indian Statistical Institute’s (ISI) mission: to remove guesswork and produce precise estimates for hard-to-track problems in a fragmented, still-developing country. Questions such as ‘how much jute is really grown in Bengal’ and ‘what is the yield per acre in any given season’ had previously been answered by the subjective patwari system (village revenue officials looking at a field and providing an estimate by eye); they were now answered with techniques like stratified random sampling and crop-cutting experiments.¹

“It is both the duty and the pleasure of the editors of a statistical journal on the verge of its centenary to offer a very hearty welcome to Prof. P.C. Mahalanobis and his colleagues who have launched Sankhya and its first part (June 1933) reflects great credit on all responsible… The editorial committee have set themselves a high standard. Their colleagues in London will watch the progress of Sankhya with hopeful interest.”

- Journal of the Royal Statistical Society

But the ISI did more than apply existing statistical methods to Indian problems – it developed new ones. Mahalanobis' most distinctive methodological innovation was the interpenetrating network of sub-samples: instead of drawing a single sample from a population, you draw two independent sub-samples, have each processed by a separate field team, and compare the resulting estimates. Sampling theory tells you how far apart two such estimates should fall by chance alone, so any divergence beyond that benchmark has to come from somewhere else – investigator bias, recording errors, inconsistent application of instructions. This converted non-sampling error into a quantity that could be measured, traced back, and corrected between rounds.² Harold Hotelling, then among the foremost statisticians in the world, eventually wrote that “no technique of random sample has, so far as I can find, been developed in the United States or elsewhere, which can compare in accuracy with that described by Professor Mahalanobis.”
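The comparison at the heart of this method can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ISI's actual procedure: the function name `divergence_z`, the simulated yields, and the 5-unit recording bias are all hypothetical, chosen only to show how the gap between two independent sub-sample means is judged against the spread that sampling theory predicts.

```python
import math
import random
import statistics


def divergence_z(sample_a, sample_b):
    """z-score of the gap between two independent sub-sample means.

    Under pure sampling error the gap has standard error
    sqrt(s_a^2 / n_a + s_b^2 / n_b), so a |z| well above ~2 suggests
    something other than chance (for example, investigator bias).
    """
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    se = math.sqrt(
        statistics.variance(sample_a) / len(sample_a)
        + statistics.variance(sample_b) / len(sample_b)
    )
    return (mean_a - mean_b) / se


# Hypothetical population of per-plot yields (not real survey data).
rng = random.Random(0)
population = [rng.gauss(100.0, 15.0) for _ in range(10_000)]

# Two sub-samples drawn independently, each handled by a different team.
team_a = rng.sample(population, 400)
team_b = rng.sample(population, 400)

# Assumed non-sampling error: team B over-records every plot by 5 units.
team_b_biased = [y + 5.0 for y in team_b]

z_clean = divergence_z(team_a, team_b)
z_biased = divergence_z(team_a, team_b_biased)
print(f"honest recording: z = {z_clean:+.2f}")
print(f"biased recording: z = {z_biased:+.2f}")
```

Because the bias shifts one team's mean without changing its variance, the second z-score moves by roughly the bias divided by the standard error, which is what makes the problem visible and traceable to a particular team.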

This is the story of how India built, then lost, and is slowly regaining the infrastructure of national self-knowledge. The ISI under Mahalanobis was, for a quarter century, one of the most productive research institutions in the developing world – a place that attracted first-rate mathematical talent, developed novel survey methods, and generated the data on which an entire nation's economic planning depended. It was also, by design, a fragile institution dependent on a unique set of circumstances. When those conditions disappeared, the system disappeared too, with consequences that are only now becoming fully clear. How did a laboratory with a first-year budget of ₹238 ($2.5)³ produce all of this to begin with? And why couldn't it last?

Statistician by chance

Mahalanobis graduated in 1915 from Cambridge University, where he read physics. He visited India, intending to return to his alma mater for a research project with C.T.R. Wilson at the Cavendish Laboratory, “but he did not go back as he found a number of problems in India that could engage his attention.” Of the preparations for this fateful trip back home, his later colleague C.R. Rao recounts: “The first World War was on and there was a short delay in his journey. Mahalanobis utilized this time browsing in the King’s College Library. One morning Macaulay, the tutor … drew his attention to some bound volumes of Biometrika.”⁴ This was Karl Pearson’s⁵ journal, which had been, since its establishment in 1901, the principal venue for the new science of statistical inference. Mahalanobis read the volumes on the boat to India and was changed by them – a physicist, by training and vocation, had serendipitously found his calling.

Initially, he took up a post teaching physics at Presidency College in Calcutta. While statistics remained a side project of his, by then he had already encountered the man who would direct that vocation. Brajendra Nath Seal, professor of philosophy at Calcutta University, had delivered a lecture at the Universal Races Congress in London in 1911 titled ‘The Meaning of Race, Tribe, Nation’ which prompted Mahalanobis’ thinking about the statistical measurement of distance between populations.⁶

In the early 1920s, Mahalanobis had the chance to develop related statistical methodology when Nelson Annandale, then director of the Zoological Survey of India, asked him to analyze anthropometric measurements of Anglo-Indians. This work resulted in the Mahalanobis Distance (or D² as it is known formally), a way of calculating how far any given observation lies from the center of a multivariate distribution while accounting for correlations between variables. Such work brought Mahalanobis to the attention of government agencies, who began commissioning him for survey work.
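In its modern textbook form, D² for an observation x, mean μ, and covariance matrix Σ is (x − μ)ᵀ Σ⁻¹ (x − μ). The sketch below is a minimal Python illustration restricted to two variables, so that the matrix inverse can be written out by hand. The function name `mahalanobis_d2` and the measurements are hypothetical, not drawn from Mahalanobis' anthropometric data.

```python
import statistics


def mahalanobis_d2(point, data):
    """Squared Mahalanobis distance of a 2-D point from the centre of `data`.

    `data` is a list of (x, y) pairs. Uses the sample covariance matrix
    [[sxx, sxy], [sxy, syy]] and its closed-form 2x2 inverse.
    """
    xs = [p[0] for p in data]
    ys = [p[1] for p in data]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    n = len(data)

    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)

    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")

    dx, dy = point[0] - mx, point[1] - my
    # (dx, dy) @ inverse(cov) @ (dx, dy)^T, expanded for the 2x2 case
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


# Hypothetical, strongly correlated measurements (say, two body dimensions).
data = [(150, 148), (160, 161), (170, 169), (180, 182), (190, 190)]

# Both test points lie the same Euclidean distance from the centre (170, 170),
# but one follows the correlation and the other cuts against it.
d2_along = mahalanobis_d2((180, 180), data)
d2_against = mahalanobis_d2((180, 160), data)
print(f"along the trend:   D^2 = {d2_along:.2f}")
print(f"against the trend: D^2 = {d2_against:.2f}")
```

The point that cuts against the correlation scores far higher, which is the sense in which the measure accounts for correlations between variables rather than treating each dimension separately.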

The Shah Jahan of statistics

Gradually, as partnerships with the government expanded in scope and more research assistants were brought on, Mahalanobis had to find a real space for his statistical work. Thus emerged the Statistical Laboratory, “a flattering name bestowed upon a narrow space partitioned by cupboards in the room adjoining his in the Physics Department.”⁷ Even as the lab grew larger and formally became the Indian Statistical Institute in 1931, it retained a structural ambiguity that turned out to be its greatest asset. The ISI was, as Mahalanobis himself put it, “in the public sector, but not a Government department.”⁸ This meant he could hire specialists directly rather than receiving whichever generalist the civil service posted to him; he could keep them on staff permanently rather than losing them to the next transfer cycle; and he could set his own research agenda alongside the survey work he contracted from provincial governments.⁹ It also meant the ISI could publish openly rather than just producing reports for its masters. The institute ran Sankhyā, an international peer-reviewed journal read by Fisher¹⁰ and Hotelling, while simultaneously producing crop estimates for the Government of Bengal. That combination of applied government work and theoretical research under one roof, each feeding the other, couldn’t reasonably be replicated by a purely governmental body – and it was what drew talent from across the country and, eventually, from abroad.

There are many research institutes in India where the scientific staff are obliged to work on problems set by the authorities, research workers, especially young ones having no choice. Things were totally different in the Institute. So it was not really a joke but an unalloyed truth that he [PCM] uttered when he said: ‘Why should I pay you, it is you who ought to pay me for all the facilities and the opportunity that I am offering you.’¹¹

Another reason the ISI was set up for success was that Mahalanobis prioritized high-quality talent and often did the recruiting himself. Ashok Rudra, who joined in 1953, later recalled that he did not have “to apply, appear before interview boards and get rejected,” and that this was “common to most young people who were recruited by PCM himself in those days.”¹² These hires included R.C. Bose, who arrived in 1934 – a pure mathematician whose first major success at the Institute was working out the distribution of the classical D² statistic alongside S.N. Roy. Bose would later become one of the three “Euler’s Spoilers” for disproving a 177-year-old conjecture on combinatorial designs. C.R. Rao joined in 1941 and went on to become one of the most cited statisticians of the twentieth century. He developed several eponymous techniques (including the Rao-Blackwell theorem) and served as the ISI’s director after Mahalanobis’ death. Once inside, this group of people encountered little hierarchical overhead. For example, when Rudra protested that he was too junior for a role, Mahalanobis told him, “Junior and senior are irrelevant matters. In the Institute, we do not pay any attention either to degrees or to age. Whoever can do a job will be given responsibility for the job.”¹³

The institute’s work, unusually flexible and talent-dense, was not limited to Calcutta. For several months each year, researchers met in a small Bihar hill town called Giridih, where the ISI had established an outpost in 1942 as an evacuation measure during WWII.¹⁴ But Giridih quickly became more than a backup site: the surrounding farmlands were used for crop-cutting experiments, testing alternative plot sizes and shapes for yield estimation. Every summer and autumn, Mahalanobis would decamp to Giridih with his full court: R.C. Bose, S.N. Roy, S.S. Bose, K.R. Nair, and a retinue of assistants. Rudra likened this to “Emperor Shah Jahan moving between Agra and Delhi, or like the Viceroy of India moving between Delhi and Simla,” but notes that these trips were anything but leisurely. If in Calcutta it was normal for Mahalanobis to be in the laboratory until nine or ten in the evening, his demands in Bihar were more exacting still. The institute’s culture resembled that of an intellectual court in its seasonal residence.

To take this analogy just a little bit further, the ISI drew its cabinet from the final flourish of the Bengal Renaissance – a world in which a statistician was close friends with Rabindranath Tagore (who wrote the foreword for the second volume of Sankhyā in 1935).¹⁵ The relationship between the Poet and the Professor¹⁶ was consequential because it allowed PCM to become acquainted with Jawaharlal Nehru. “[Nehru] had met the Professor socially on several occasions, usually when calling on Rabindranath Tagore…The Professor was present, for example, when Nehru visited the poet in 1939 to ask if his creation, Jana Gana Mana, could be used as the national anthem when freedom arrived.” Later, in 1940, Mahalanobis visited Nehru at Anand Bhavan in Allahabad, where they stayed up past two in the morning discussing India's economic future.¹⁷

The idea of planned economic development pervaded 1930s Indian nationalism – Visvesvaraya's Planned Economy for India had appeared in 1934, the National Planning Committee [NPC] was formed in 1938, the Bombay Plan would follow in 1944 – but none of these efforts had a credible method for generating the data such a plan required.¹⁸ Moreover, as chairman of the NPC from 1938, Nehru had been confronted with what he called “the absence of accurate data and statistics” and the quality of available information was a ubiquitous complaint.¹⁹ Mahalanobis and the ISI helped fill this gap: he argued that the data a planned economy needed could only come from large-scale random sample surveys, because administrative records were unreliable and complete enumeration too slow and expensive for a country of India's size.²⁰ After independence, these ideas found institutional expression, and in early 1949, Nehru recommended that Mahalanobis be appointed Honorary Statistical Adviser to the Cabinet.

But Mahalanobis the planner was quite different from Mahalanobis the statistician. He was the chief architect of the Second Five-Year Plan, and his design – based on what we now call the Feldman-Mahalanobis model – channeled investment overwhelmingly toward heavy industry: steel, machine tools, capital goods. The model also assumed a near-closed economy and dramatically underestimated the foreign exchange costs of import substitution. B.R. Shenoy correctly predicted a balance-of-payments crisis, inflation, and a thicket of controls. When Milton Friedman visited India in the early 1960s, he observed that per capita food grain availability had not risen since 1950, that cloth consumption was no higher than in 1939, and that the poorest third of the population had experienced no increase whatsoever in food consumption during the decade of the first two plans.²¹ Mahalanobis had himself confessed to his colleague Pitambar Pant in 1954: “I had only very vague ideas of planning when I first came to Delhi.”²²

The National Sample Survey

Even as Mahalanobis was drawn into planning, his more lasting contribution was already underway. For all the theoretical advances that had emerged from the ISI, India in 1950 still lacked the most basic empirical knowledge of how its own population lived. What do the people of the nation eat? How much do they earn? Are they employed? In the absence of answers, policy was based on aggregate agricultural statistics (themselves unreliable), provincial tax records, and the impressionistic reports of district officers.

One available model was the Soviet apparatus run through Gosplan. In fact, Mahalanobis even took a trip to the USSR to better understand its planning-monitoring system, which relied on administrative reporting such as factory output figures and collective farm data flowing upward through state mechanisms.²³ India, however, had no comparable administrative infrastructure. The National Sample Survey [NSS], founded in 1950, therefore took the opposite approach: large-scale probability sampling with methodological innovations like interpenetrating subsamples, designed to capture an informal, heterogeneous economy and oriented toward estimating consumption, poverty, and welfare distribution at the population level.²⁴

The NSS operated in “rounds,” with each round a nationwide sweep of a stratified random sample of rural and urban localities, in which investigators knocked on doors and asked a random selection of households a set of questions. Every sampled household was asked to list, item by item, what it had consumed over the previous thirty days – rice, dal, vegetables, cooking oil, fuel, clothing, and all the small expenditures that make up a life. From these lists, aggregated across thousands of households, it became possible to estimate, for the first time, what a typical Indian household actually consumed. More importantly, this data could be used to determine the cost of lifting that household to any given standard of living. This is where the poverty line comes from.²⁵
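The aggregation step can be made concrete with a small Python sketch. It assumes the simplest possible design, a simple random sample within each of two strata, which is far cruder than an actual NSS round. The stratum sizes, the simulated expenditures, the poverty line, and the function names `stratified_mean` and `headcount_ratio` are all hypothetical.

```python
import random
import statistics


def stratified_mean(samples, stratum_sizes):
    """Population mean estimated from simple random samples within strata.

    Each stratum's sample mean is weighted by its share of the population:
        estimate = sum over strata h of (N_h / N) * mean_h
    """
    total = sum(stratum_sizes.values())
    return sum(
        stratum_sizes[h] / total * statistics.fmean(values)
        for h, values in samples.items()
    )


def headcount_ratio(samples, stratum_sizes, poverty_line):
    """Estimated share of households spending below `poverty_line`."""
    below = {
        h: [1.0 if v < poverty_line else 0.0 for v in values]
        for h, values in samples.items()
    }
    return stratified_mean(below, stratum_sizes)


# Hypothetical sampling frame: number of households per stratum.
stratum_sizes = {"rural": 80_000, "urban": 20_000}

# Hypothetical 30-day household expenditure reported by sampled households.
rng = random.Random(1)
samples = {
    "rural": [rng.lognormvariate(3.0, 0.5) for _ in range(300)],
    "urban": [rng.lognormvariate(3.5, 0.5) for _ in range(300)],
}

mean_spend = stratified_mean(samples, stratum_sizes)
poor_share = headcount_ratio(samples, stratum_sizes, poverty_line=15.0)
print(f"estimated mean 30-day expenditure: {mean_spend:.1f}")
print(f"estimated share below the line:    {poor_share:.1%}")
```

Weighting by stratum size matters because the two samples are the same size while the strata are not; an unweighted average would overstate the urban households' influence on the national figure.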

There was now data that could be used for the government’s Five-Year Plans, to set employment targets and investment allocation.²⁶ It was used to estimate the scale of rural unemployment, to track the effects of the Green Revolution on different income groups, and to monitor whether the benefits of growth were reaching the bottom of the distribution. Angus Deaton, whose Nobel-winning work on consumption and poverty drew extensively on this data, has written that these “were the world's first system of household surveys to apply the principles of random sampling established in the 1920s and 1930s.”

Building a non-human computer

There was, however, a problem. Running surveys at this scale required computation, and in 1950, this word meant rooms full of human “computers” performing arithmetic by hand. Mahalanobis knew that as his statistical work grew, he would need an electronic computer.

But he did not make it easy for himself to acquire one. The Cold War was at its zenith, and Mahalanobis was not coy about his Soviet sympathies. In 1953, he wrote a letter describing how “stunned” he felt by “the passing away of the great Stalin.”²⁷ This was not an isolated sentiment. The Indian government's entire economic philosophy tilted toward Soviet-style central planning, and Mahalanobis’ own staff included dozens of members of the Communist Party of India.²⁸ The United States government drew the obvious conclusion and blocked the sale of American computers to India.

The ISI improvised, and in 1953, an engineer named Samarendra Kumar Mitra, working in the ISI's Electronics Division, assembled what is now recognized as India's first electronic computer. “It was a small analogue machine capable of solving simultaneous linear equations of up to ten variables. Lacking the resources to import expensive parts, they built it using materials salvaged from the Second World War surplus in the scrap markets and disposal depots of Calcutta’s Chandni Chowk market.”²⁹ An excess of parts from the Allied demobilization, including military electronics, had washed up in India's informal markets, and so a country that could not buy computing technology from global superpowers built its own from the debris of wars.³⁰

The present day

Shruti Rajagopalan has pointed out that the Indian planning apparatus was the product of two distinct forces – the socialist ideological current within the freedom movement, and the scientists and technicians who actually built the machinery. Crudely, one can say that Nehru was responsible for the ideological vision while Mahalanobis dealt with the mathematical and statistical apparatus.

This alignment had intellectual roots in what Raghabendra Chattopadhyay, in his study of Indian planning, calls Nehru's “scientism,” or the conviction that correct scientific technique could render political questions of class and redistribution redundant: “Nehru, as the chairman of the Committee, was proud of its ‘quiet’ and ‘expert’ work, and advised that the ‘political’ element should be kept outside the Committee. Planning, to him, was an economic process, the result only of the quiet and expert work of scientists, economists, and industrialists.”³¹ There is a sense of Fabian inheritance here; the idea that social investigation in the form of counting, measuring, and enumerating was the precondition for rational reform.³² The ISI under Mahalanobis was the expression of this conviction. But it also meant that the institution’s position depended on a particular theory of governance that would struggle to outlive its originators.

Mahalanobis understood this to a certain extent, as can be seen in a letter he wrote to C. D. Deshmukh, then president of the ISI: India needed an institute that was “outside Government but which would take up, by agreement with Government, such scientific work or research for promotion of science, or the coordination of scientific activities as can be done conveniently and efficiently by a non-official agency.”³³ Such a structure, unless codified, would not work without the presence of a strong individual, one who powered through with sheer force of personality, political relationships, and a deliberate willingness to flout bureaucratic rules he considered meaningless. The cracks appeared early, as Nehru's death in 1964 removed a key political patron. The Statutory Committee for 1964-65 found that the ISI’s budget preparation methods were “inadequate” and that important details were “lacking.”³⁴ There were other probes into the ISI, and review committees took an excessively harsh view of the institute in light of PCM’s unorthodox managerial style.³⁵

Slowly, ISI lost the systems responsible for its initial success. This decline was not instantaneous, and C.R. Rao, who continued as ISI director after Mahalanobis’ death in 1972, remained one of the most productive statisticians in the world – his contributions to estimation theory, information geometry, and multivariate analysis continued for decades. Yet without Mahalanobis’ political access and calculated rule-breaking, the ISI gradually formalised into what he had spent his career preventing it from becoming: an ordinary government institution, subject to the usual bureaucratic rhythms of transfers, committees, and underfunding.³⁶

Pramit Bhattacharya, in his account of this trajectory, identifies four reinforcing causes: the absence of an apex statistical authority (“Mahalanobis’ scientific achievements, his global stature, and his unique position in the Nehru cabinet ensured that he could act as a one-man statistical commission”); chronic underinvestment in computing infrastructure; the waning influence of technocrats in Indian policymaking as the Nehruvian premium on scientific expertise gave way; and the lack of feedback loops. The first three can be equated with lessening governmental interest and support, but the last is indicative of institutional malaise. Since data produced by the statistical system was consumed mainly by planners and academics, there was no broad constituency to notice or complain when quality declined.

The Rangarajan Commission, set up in 2000 under the former RBI Governor C. Rangarajan, identified this vacuum. In its 2001 report, it recommended the creation of a permanent, statutory National Commission on Statistics to serve as the core body for all primary statistical activities – a centralized endeavor that would monitor and enforce statistical priorities and standards, and ensure coordination among the many agencies involved. However, implementation here was half-hearted and the consequences of this partial reform became dramatically visible in 2019. The National Sample Survey Organisation had conducted its 75th round Consumer Expenditure Survey in 2017-18, the first such survey in six years, and one that economists had been waiting for with real anticipation (since consumption expenditure data is the basis for India’s poverty estimates). But when the results showed that consumer spending had actually declined for the first time in over four decades, the government suppressed the report. It was leaked to the press, then formally junked.

There was also the persistent problem of delayed survey rounds, and the gap between consumption expenditure surveys had stretched to over a decade. Additionally, the NSSO was absorbed into a new National Statistical Office in 2019. When the next Household Consumption Expenditure Survey (HCES) was finally released in 2024, covering 2022-23, it arrived with a fundamental problem: it was not comparable with any previous NSS round because the sampling design, the recall periods, and the structure of data collection itself had changed.³⁷ It may perhaps be true that the new methodology is superior, but the practical effect was the severing of the only long-running consumption time series that India had – the one Mahalanobis’ NSS had been building since 1950.

This non-comparability had immediate downstream consequences. When the World Bank published updated poverty estimates for India in June 2025, drawing on the HCES 2022-23 data, the economist Himanshu laid out in detail why the numbers were unreliable. The Bank had constructed a novel “welfare aggregate” that applied only to India, which replaced the standard consumption expenditure measure. Himanshu also observed that there had been “a steady erosion in the reputation India has enjoyed for its approach to poverty measurement.”

Where do we go next?

It is easy to be disappointed by the unfortunate arc of this story – one that moves from Indian exceptionalism and the beginnings of a rich intellectual tradition towards a state of affairs that seems like the status quo. But the truth is that it is extraordinarily difficult to build functional institutions that do equally well on the efficiency and longevity axes. Mahalanobis built a statistical system that, for a quarter century, was among the most rigorous in the developing world. It produced the methods, the data, and the talent that let a newly independent country of 350 million people begin to see itself clearly for the first time. However, the apparatus relied on him, on Nehru’s conviction that governance should be scientific, and on an institutional hybrid that was never codified concretely. The legacy of Mahalanobis and the current state of affairs with the ISI lend credence to the argument that goodwill and personal genius are not durable institutional foundations because they lack guarantees of continuity as governments change.

Furthermore, when cracks appear, they are not immediately tragic. There is no single moment of visible failure, no crisis that forces a response. Instead, survey rounds are delayed by a year, then by a decade. A questionnaire is redesigned. A consumption series that had run since 1950 is severed. It is only over time that the consequences of these decisions become clear.

Today, the most visible failures have begun to be repaired. India is finally conducting its first census since 2011 – house-listing for the 16th Census began in April 2026, with population enumeration scheduled for February 2027, closing a 16-year gap. At the statistics ministry, Saurabh Garg, installed as its senior-most official in 2024, has cleared the survey backlog, instituted a published calendar of data releases, and pushed agencies to produce output figures at the level of the country’s 800-something districts. Pramit Bhattacharya, of Data for India, told The Economist recently that the speed of these changes would have seemed impossible from where things stood in 2023. In February 2026, the National Statistical Office also launched a beta Model Context Protocol server covering twenty-one official datasets that is “designed to enable direct interaction with statistical datasets through users’ own AI-based tools and applications.”

The shape of the reform, however, should give pause. Garg is already past the official civil service retirement age of sixty, working on successive extensions, and as The Economist concludes, Indian governance “relies more on individual capability than established process.” This is the Mahalanobis problem in its current iteration: a talented individual with political access reorganises a sclerotic apparatus, and the apparatus produces useful data again, but with no guarantee of longevity or continuity. One can only hope that this time systems are put in place to prevent another slow degradation of the statistical machinery. Today, the country that pioneered random sampling at scale needs to once again innovate on institutions, ideas, and methods, to continue answering the most basic question a government can ask:

Are our people better off than they were before?

Notes from Team Alter

"I think there is a really nice arc to trace here, one that builds from Indian exceptionalism and ends with a pointed critique of why we need good data today." That email from Hiya, in May 2025, is where Alter #4 began.

It took a year to get from that sentence to this page.

Statistics, it turns out, is a subject built on hope. The belief that if you measure a thing carefully enough, honestly enough, with sufficient regard for the people being measured, you can make better decisions. Mahalanobis understood this. We tried to design from this idea.

The piece opens on archival footage of him in the field, cut against an ASCII composition of D² and the operators that defined his work. The Shah Jahan of statistics, on location. What followed was a pop-art palette applied to a seemingly arid subject, and tactile paper folders, filled with sourced clippings and photographs that carry the reader through the pivotal moments of India's data provenance.

The crimson age of AI gave us tools Mahalanobis would have found quietly astonishing. We made the most of them.

If you enjoyed reading, please do share. We're at @altermagindia
