## Server Bug Fix: Did many programs really store years as two characters (Y2K bug)?

The claim that programs stored dates as two ASCII (or similar) characters because computers were limited in resources seems wrong to me, because that takes more memory than one 8-bit integer would. In certain cases it is also slower.

Storing the year as one 8-bit signed integer would give (assuming 1900 + year) a range of 1772 (1900 – 128) to 2027 (1900 + 127). An unsigned 8-bit integer gives 1900 to 2155 (1900 + 255). In either case the end result is a much wider range and half the storage and RAM requirements.
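A minimal sketch of this offset-year scheme in Python (the helper names are hypothetical, purely for illustration):

```python
# Sketch of the proposal above: store a year as one unsigned byte with a
# fixed 1900 offset, giving the 1900-2155 range quoted in the text.

def year_to_byte(year: int) -> int:
    """Encode a year as an unsigned 8-bit offset from 1900."""
    offset = year - 1900
    if not 0 <= offset <= 255:
        raise ValueError("year out of range for unsigned 8-bit storage")
    return offset

def byte_to_year(b: int) -> int:
    """Decode the unsigned 8-bit offset back to a calendar year."""
    return 1900 + b

assert byte_to_year(year_to_byte(2027)) == 2027
assert byte_to_year(255) == 2155   # the upper limit quoted above
```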

In addition, to perform arithmetic on the years, the software would need to convert the two characters back to an integer, whereas if the year were stored as an integer it could be loaded without conversion. This is where the "slower in certain cases" comes in. It could also make sorting by year slower.

In my view, the only reasons I can think of for storing a year as two characters are bad programming or using a text-based storage format (such as XML or JSON, though I know those are more contemporary than the programs that had the Y2K bug). Arguably, choosing a text-based storage format is itself an example of bad programming, because it is not a good choice for a very resource-limited computer.

How many programs stored years as two characters and why?

Short Answer: BCD rules over a single-byte integer.

The claim that programs stored dates as two ASCII or similar characters because computers were limited in resources seems wrong to me

The point wasn’t about using ASCII or anything ‘similar’; it was about using only two decimal digits. Those can be stored as two characters (not necessarily ASCII) or as two BCD digits in a single byte.

because it takes more memory than one 8-bit integer would.

Two BCD digits fit quite nicely in 8 bits – in fact, that’s the very reason a byte is made of 8 bits.

Also in certain cases it’s also slower.

Not really. In fact, on quite a few machines, using a single-byte integer will be considerably slower than using a word – or BCD, for that matter.

In addition to perform arithmetic on the years the software would need to convert the two characters back to an integer whereas if it were stored as an integer it could be loaded without conversion.

That is only true, and only needed, if the CPU in question cannot handle BCD natively.

This is where the slower in certain cases comes in. In addition it could make sorting by year slower.

Why? Sorting is usually done not by year but by full date – which in BCD is 3 bytes.
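For illustration, here is a hedged Python sketch of packing a YYMMDD date into 3 BCD bytes, two decimal digits per byte (the helper names are invented, not from any particular system):

```python
# Illustrative sketch: pack a YYMMDD date into 3 BCD bytes,
# two decimal digits per byte, as described above.

def to_bcd(n: int) -> int:
    """Pack a two-digit number 0-99 into one BCD byte."""
    return ((n // 10) << 4) | (n % 10)

def from_bcd(b: int) -> int:
    """Unpack one BCD byte back to its two-digit value."""
    return (b >> 4) * 10 + (b & 0x0F)

def pack_date(yy: int, mm: int, dd: int) -> bytes:
    return bytes([to_bcd(yy), to_bcd(mm), to_bcd(dd)])

date = pack_date(86, 12, 31)          # 31 Dec (19)86
assert date == b'\x86\x12\x31'        # each byte reads like its decimal digits
assert [from_bcd(b) for b in date] == [86, 12, 31]
```

A pleasant side effect is that packed BCD dates in YYMMDD order compare and sort correctly as raw bytes, with no conversion at all.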

In my view the only reason I can think of for why someone would store a year as two characters is bad programming

Are you trying to say that everyone during the first 40 years of IT, up to the year 2000, was an idiot?

or using a text based storage format

Now you’re coming close. Ever wondered why your terminal emulation defaults to 80 characters? Exactly, it’s the size of a punch card. And punch cards don’t store bytes or binary information but characters. One column one digit. Storage evolved from there.

And storage on mainframes was always a rare resource – or how much room do you think one can give to data when the job is to

• handle 20k transactions per hour

on a machine with a

• 0.9 MIPS CPU,
• 1.5 megabyte of RAM, some
• 2 GiB of disk storage?

That was all present to

• serve 300-500 concurrent users

Yes, that is what a mainframe in 1980 was.

Load always outgrew increased capabilities. And believe me, shaving off a byte from every date to use only YYMMDD instead of YYYYMMDD was a considerable (25%) gain.

(such as something like XML or JSON which I know those are more contemporary in comparison to programs that had the Y2K bug).

No one would have ever thought of such bloated formats back then.

Arguably you can say that choosing a text-based storage format is an example of bad programming because it’s not a good choice for a very resource limited computer.

Been there, done that, and it is. Storing a date in BCD results in overall higher performance:

• Optimal storage (3 bytes per date)
• No constant conversion to and from binary needed
• Conversion to readable form (mostly EBCDIC, not ASCII) is a single machine instruction
• Calculations can be done without converting and filling, by using BCD instructions

How many programs stored years as two characters and why?

A vast majority of mainframe programs did – programs that are part of incredibly huge applications in worldwide networks. Chances are close to 100% that any financial transaction you make today is still handled at some point by a /370-ish mainframe. And those are also the machines that not only come from the punch-card age and tight memory situations, but also handle BCD as a native data type.

And another story from Grandpa’s vault:

For one rather large mainframe application we solved the Y2K problem without extending records, but by extending the BCD range: the next year after 99 (for 1999) became A0 (for 2000). This worked because the decade is the topmost digit. Of course, all I/O functions had to be adjusted, but that was a lesser job. Changing the data storage format would have been a gigantic task with endless chances of bugs. Also, any date conversion would have meant stopping the live system, maybe for days (there were billions of records to convert) – something not possible for a mission-critical system that needs to run 24/7.

We also added a virtual conversion layer, but it only kicked in for a small number of data records for a short time during rollover.

In the end we still had to stop shortly before midnight CET and restart a minute later, as management decided this would be a good measure to avoid rollover problems. Well, their decision. And, as usual, a completely useless one, since the system ran across multiple time zones (almost around the world, Vladivostok to French Guiana), so it passed multiple rollover points that night.
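A Python sketch of the decoding this trick implies (my reconstruction, assuming straightforward nibble arithmetic): since the decade is the top nibble, letting it run past 9 extends the range without changing the record layout:

```python
# Sketch of the extended-BCD trick described above: the decade nibble may
# exceed 9, so the year byte after 0x99 is 0xA0 (assumed decoding, for
# illustration only).

def decode_extended_bcd_year(b: int) -> int:
    """Decode a year byte whose high nibble may exceed 9 (0x99 -> 1999, 0xA0 -> 2000)."""
    return 1900 + (b >> 4) * 10 + (b & 0x0F)

assert decode_extended_bcd_year(0x99) == 1999
assert decode_extended_bcd_year(0xA0) == 2000   # decade nibble 10 -> 100 years
assert decode_extended_bcd_year(0xA5) == 2005
```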

I programmed in Cobol for nearly 20 years at the start of my career and it was quite common to see two digit years used. Initially this was due to storage concerns – not necessarily only memory, but disk storage also.

Cobol has no intrinsic date field.

It was very common to store dates in Cobol like:

```
01  EMP-HIRE-DATE.
    03  EMP-HIRE-DATE-YY    PIC 99.
    03  EMP-HIRE-DATE-MM    PIC 99.
    03  EMP-HIRE-DATE-DD    PIC 99.
```

If it’s 1970, you may not think your program is going to be used for another 30 years… it was more important to save those two bytes. Multiplied out over multiple fields and millions of rows, it all adds up when storage is at a premium.

As 2000 started approaching, people had to start dealing with this. I worked for a company that had software that dealt with contracts, and the end dates started stretching into the 2000s in the mid-80s. As a rookie, I got to deal with fixing that.

There were two ways. First, if it was feasible, I added

```
03  EMP-HIRE-DATE-CC    PIC 99.
```

to every date field. It ended up looking like:

```
01  EMP-HIRE-DATE.
    03  EMP-HIRE-DATE-YR.
        05  EMP-HIRE-DATE-CC    PIC 99.
        05  EMP-HIRE-DATE-YY    PIC 99.
    03  EMP-HIRE-DATE-MM    PIC 99.
    03  EMP-HIRE-DATE-DD    PIC 99.
```

But in Cobol, you can’t just change that if it is the record description of a file: Cobol has fixed-length records, so if you insert two bytes in the middle, you have to create a new file, read all the records in the old format, move them to the new record format, and write them to the new file, then “swap” the file back to its original name. If you were dealing with huge datasets, you needed lots of disk space and time to do all that. And you also had to rebuild all the programs.
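The conversion pass itself can be sketched like this in Python (the field offsets and the pivot rule are illustrative, mirroring the COBOL snippets above, not any real system):

```python
# Sketch of the one-off file conversion described above: copy fixed-length
# records, inserting a two-character century in front of the two-digit year.
# The record layout and year offset are hypothetical.

def widen_record(old: bytes, year_offset: int) -> bytes:
    """Insert '19' or '20' (by a pivot rule) before the YY field of one record."""
    yy = int(old[year_offset:year_offset + 2])
    century = b'19' if yy > 50 else b'20'     # same pivot as the COBOL snippet below
    return old[:year_offset] + century + old[year_offset:]

old_record = b'SMITH     861231'              # 10-char name + YYMMDD hire date
new_record = widen_record(old_record, 10)
assert new_record == b'SMITH     19861231'
```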

When it wasn’t feasible, we added code when printing or displaying dates.

```
IF EMP-HIRE-DATE-YY > 50
    MOVE 19 TO PRINT-HIRE-DATE-CC
ELSE
    MOVE 20 TO PRINT-HIRE-DATE-CC.
```

This bought another 50 years…

To add to Raffzahn’s answer, here’s an example from a 1958 book on data processing:

… The names of authors are coded in columns 11-14. The first four consonants of the name form the code. The last two digits of the year of the reference are punched directly into columns 15-16. The journal, or other source for the reference, is coded in columns 17-20. …

(from Punched Cards Their Applications To Science And Industry. 2nd ed. Edited by Robert S. Casey, James W. Perry, Madeline M. Berry and Allen Kent. New York, Reinhold Publishing Corp; 1958; emphasis mine. The card image shows `52`, assumed to mean 1952, in the noted columns)

This book barely touches on computer applications, since unit record usage was more prevalent at the time. Computers inherited these tabular data formats, and the physical inertia of the data format — cardboard boxes containing millions of records — meant that tabular years stuck around for a very long time. The book does have a couple of mentions of dealing with centuries, but the typical outlook can be summarized by the quotation from p. 326: “The century is neglected inasmuch as most entries for this particular file are in the 20th century”. Would you spend/waste 2½% of the available fields in a record just to encode something you (probably) won’t use?

Data, especially government data, sticks around for a long time. Consider this: the last person to receive a US Civil War pension died in May 2020. The US Civil War ended in 1865. Tabulating equipment didn’t come in until the 1880s. Bytes didn’t happen until 1956. JSON was this century. So data formats tend to be what the past gives you.

There are actually still RTC chips on the market which store the year only as a pair of BCD digits, and do not have a century field. These are usually packed into one byte when read, in common with the fields for day-of-month, month-of-year, hour, minute, and second. The same format is used for the alarm settings.

Old software using these chips would simply prefix the year digits with 19 for display. Current systems using them would prefix 20. The problem rears its ugly head whenever the system is used in a century that it wasn’t designed for – so potentially also in the year 2100.

If an RTC chip existed that counted in binary instead of BCD, then it could handle years in two and a half centuries without trouble. An old machine adding 1900 to the stored year would not see any rollover trouble until after the year 2155. This would however require the host system to convert binary to decimal for display – not a problem for a PC, a minicomputer, or even an 8-bit microcomputer, but potentially a burden for extremely lightweight embedded systems.
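A hedged Python sketch of what reading such a chip involves (register values invented for illustration; real chips differ in detail):

```python
# Sketch of reading a BCD year register from an RTC of this kind and
# choosing a century prefix, as described above.

def bcd_byte_to_int(b: int) -> int:
    """Unpack one packed-BCD byte (two decimal digits) to an integer."""
    return (b >> 4) * 10 + (b & 0x0F)

def rtc_year(reg: int, century: int = 20) -> int:
    """Combine a fixed century prefix with the chip's two BCD year digits."""
    return century * 100 + bcd_byte_to_int(reg)

assert rtc_year(0x25) == 2025               # modern firmware prefixes 20
assert rtc_year(0x99, century=19) == 1999   # old software prefixed 19
```

The design choice is visible here: the century lives in the host software, not the chip, which is exactly why the problem recurs whenever the system outlives its assumed century.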

Another point that hasn’t yet been mentioned is that many programs would store data in whatever format it was keyed in. If data was stored as six digits YYMMDD, that’s what data-entry clerks would be expected to type. For a program to store data as YYYYMMDD, it would be necessary either to have clerks enter an extra two digits per record, or to have software compute an extra two digits based upon the two digits of the year the person typed. Keeping the information as typed, but using modular arithmetic so that 01 - 99 = 02 (mod 100), wouldn’t necessarily be any more difficult than adding extra logic at the data-entry stage.
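A minimal Python sketch of that modular arithmetic (the helper is hypothetical):

```python
# Sketch of the wrap-around year arithmetic described above: keep years as
# typed (two digits) and take differences mod 100, so 01 - 99 = 02.

def year_diff_mod100(a: int, b: int) -> int:
    """Difference between two two-digit years, wrapping at the century."""
    return (a - b) % 100

assert year_diff_mod100(1, 99) == 2    # 2001 - 1999
assert year_diff_mod100(95, 90) == 5
```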

I worked with IBM minicomputers (S/36, S/38, and AS/400) in the 1980s and 1990s. The very ugly programming language RPG was popular. It looked very different from Cobol but had similar abilities. It supported only two data types: fixed-length alphanumeric and fixed-precision decimal numeric. The numeric variables could be zoned, meaning one digit per byte, or packed, meaning two digits per byte. Storage of dates varied quite a lot, but the most common was as 6-digit numbers rather than separate year, month, and day fields. Usually, but not always, these would be YYMMDD in the database so that sorting would be sensible.

Numeric fields were always signed, so the last half of the last byte was reserved for the sign (hex F for + and hex D for -). So a 6-digit number required 4 bytes rather than 3, and the first half of the first byte was unused. This allowed a trick when year 2000 became an issue: the unused first half of the first byte could be used for a century flag: 0 for 19xx and 1 for 20xx. The format is usually described as CYYMMDD. Systems which use this format will have a 2100 or 2900 problem, but the culprits won’t be around to see it. Actually, many systems will have a problem sooner, as the 6-digit format is still popular for display and entry. Arbitrary cut-off dates, e.g. 40, are used to guess the century when 6 digits are entered. So 010195 is assumed to be 1995, but 010115 is assumed to be 2015. A problem is therefore due in the next couple of decades for some, but it won’t be as bad as the year 2000, since at least the database can cope beyond the limit.
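A Python sketch of decoding such a CYYMMDD packed-decimal field, as I understand the layout described above (the helper is illustrative, not RPG): 7 digits plus a sign nibble packed into 4 bytes, with the leading nibble as the century flag.

```python
# Sketch decoding a packed-decimal CYYMMDD date: 7 digits plus a trailing
# sign nibble in 4 bytes; leading nibble is the century flag (0 = 19xx,
# 1 = 20xx). Layout per the answer above; the helper name is invented.

def unpack_cyymmdd(raw: bytes) -> tuple[int, int, int]:
    nibbles = []
    for b in raw:
        nibbles += [b >> 4, b & 0x0F]
    assert nibbles[-1] in (0xF, 0xD)        # sign nibble: F = +, D = -
    c, y1, y2, m1, m2, d1, d2 = nibbles[:7]
    year = 1900 + c * 100 + y1 * 10 + y2
    return year, m1 * 10 + m2, d1 * 10 + d2

assert unpack_cyymmdd(bytes([0x12, 0x50, 0x61, 0x5F])) == (2025, 6, 15)
assert unpack_cyymmdd(bytes([0x09, 0x91, 0x23, 0x1F])) == (1999, 12, 31)
```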

On BCD and performance, BCD is not necessarily much or any slower since machines intended for business use usually have hardware support. Also, conversion to and from human readable format is simpler. Many business programs will display and accept entry of data far more often than perform arithmetic on it so even if there was an arithmetic performance impact it would typically be outweighed by the I/O gains.

Not relevant to dates but BCD is useful for money. Try adding 0.01 up 100 times in a float or double and then comparing it to 1.00. BCD gets this right but float and double don’t. Accountants get very upset if the balance sheet is off even by a cent.
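This is easy to demonstrate with Python's `decimal` module standing in for BCD-style decimal arithmetic:

```python
# The rounding problem described above: add 0.01 a hundred times in binary
# floating point versus decimal arithmetic.
from decimal import Decimal

total_float = 0.0
total_dec = Decimal("0.00")
for _ in range(100):
    total_float += 0.01
    total_dec += Decimal("0.01")

assert total_float != 1.0              # binary float accumulates error
assert total_dec == Decimal("1.00")    # decimal arithmetic is exact here
```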

The mainframes have been covered. Let’s look at what led up to the PC which of course is where a lot of business software evolved.

Many PC programmers had previously been programmers for 8-bit systems, and had grown up on the “every cycle counts” mentality. Many programs for MS-DOS had ports for older 8-bit systems, too, so would have often followed the same internal formats.

• Most 8-bit CPUs (and also the 16-bit CPU used in IBM PCs) had specific BCD instructions, which operated on one pair of decimal digits.
• Most 8-bit CPUs did not have multiply or divide (or mod) instructions at all! Converting from binary to base 10 and back was a lot of work and involved loops or tables. OK, you could use two BCD bytes for a 4-digit number, but if you wanted to, e.g., subtract one date from another, it would have required maybe five times as many assembly instructions/cycles.
• In terms of RAM, having each date take an extra byte or two is one thing—having 5-10 extra instructions to deal with it was not worth it when you have 64KB of RAM or less. Consider that even by just typing the number “86” the program would have to add 1900 to it.
• Assembly language instructions don’t just take up space—assembly language programming is hard! Every instruction took up a line of screen space, had potential for a bug, and consumed a bit of the programmer’s headspace. (Everyone was confident at working with BCD; it has now fallen by the wayside because of high-level languages.)

While we’re on the topic of screen space, many programs didn’t even display a leading “19” on the screen. (Hello—40 character screen width!) So there was hardly any point at all for the program to keep track of that internally.

Back in olden times (1960s and earlier), when data came on Hollerith cards (80 columns, 12 rows), using 2 extra columns (of the 80 available) to store the fixed string “19” as part of a date seemed useless.

Having more than 80 columns of data meant that one had to use multiple cards, and lost several columns to the overhead needed to say “I’m with him”, plus the extra programming effort to validate the pairing.

Hardly anybody thought their code would still be in use when “19” changed to “20”.

Anybody who stores fixed field size dates as “yyyymmdd” has a Y10K problem pending.

• In banking, many computations have historically used binary-coded decimal (BCD) as a compromise between space, speed, and consistent rounding rules. Floating point was uncommon, slow, and difficult to round consistently. So even things like interest rates might be stored in fixed-point BCD. An interest rate might be stored as four BCD digits, in two bytes, with an implicit decimal point after the first digit. Thus, your account could pay interest rates from 0.000% to 9.999%, which seemed perfectly sufficient, because a bank would never pay 10% or more. This thinking left 1970s programmers scrambling to update data structures when high inflation and new types of accounts like Certificates of Deposit crossed that threshold.

• When gas prices first crossed the \$1/gallon threshold, many of the pumps could not be set to the actual value. Though that was largely a mechanical limitation rather than what we would consider a computing problem today, the thinking that led to the situation was largely the same: reduce costs by not bothering to support values that were unimaginably large at design time.

• [Apocryphal] When the first 9-1-1 emergency dispatch service was put into use in Manhattan, legend says its database schema hadn’t anticipated five-digit apartment numbers, leading to delays of paramedics reaching sick or injured high-rise dwellers.

• In the 1990s, phone numbers in the North American Numbering Plan were all of the form (AAA) BBB-CCCC, where the second A was always a 0 or 1 and the first B could not be a 1. (There were other restrictions, too, but they aren’t relevant here.) By exploiting the well-known restrictions, the contact management software on your PC could represent a phone number with a 32-bit value, saving disc space and RAM (at the run-time cost of a little bit-swizzling to encode and decode the values). But the soaring popularity of fax machines, pagers, modems, and cell phones drove demand for phone numbers until North America ran out of area codes, causing software publishers to scramble to update their software to use less elegant space-saving tricks.

• It’s not always about memory savings or binary-coded decimal. Sometimes it’s about space on a form or a display. In high-end software for stock trading, display space is at a premium. When it first seemed the Dow Jones Industrial Average could potentially break \$10,000.00, there was concern about whether these displays would show the additional digit, and whether there was room for a thousands separator. If the values were truncated or displayed in other confusing ways, would it slow down the traders who relied on these systems? Could it trigger a Dow-10K panic sell-off if it confused them into thinking the value had plummeted?
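The fixed-point BCD rate encoding from the banking example above can be sketched in Python (the exact layout is assumed from the description, purely for illustration):

```python
# Sketch of the fixed-point rate encoding described above: four BCD digits
# in two bytes with an implicit decimal point after the first digit, giving
# 0.000% to 9.999%. Encoding details are assumed for illustration.

def encode_rate(rate: float) -> bytes:
    """Encode a rate like 5.375 (%) as four BCD digits in two bytes."""
    n = round(rate * 1000)             # 5.375 -> 5375
    if not 0 <= n <= 9999:
        raise ValueError("rate outside representable range 0.000-9.999%")
    digits = [n // 1000, n // 100 % 10, n // 10 % 10, n % 10]
    return bytes([(digits[0] << 4) | digits[1], (digits[2] << 4) | digits[3]])

def decode_rate(raw: bytes) -> float:
    d = [raw[0] >> 4, raw[0] & 0xF, raw[1] >> 4, raw[1] & 0xF]
    return (d[0] * 1000 + d[1] * 100 + d[2] * 10 + d[3]) / 1000

assert encode_rate(5.375) == b'\x53\x75'
assert decode_rate(b'\x53\x75') == 5.375
# encode_rate(10.0) raises: exactly the ceiling 1970s programmers hit
```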

# Microsoft Word did

I distinctly remember looking into the binary .doc format used by Microsoft Word 5.0 for DOS, because documents saved when the year was after 1999 couldn’t be opened again. It turned out that the internal meta-data included the date using two ASCII digits for the year and 2000 was blindly stored as `100`, thus overwriting one byte of an adjacent meta-data field, breaking the document.
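A Python sketch of the failure mode as described (the real .doc layout is not reproduced; this just shows how formatting year-1900 into a two-character field spills a third byte into the neighbouring field):

```python
# Sketch of the overflow described above: a two-character year field
# silently receives three characters for the year 2000. The buffer layout
# is hypothetical, for illustration only.

def write_year_field(buffer: bytearray, offset: int, year: int) -> None:
    text = "%02d" % (year - 1900)      # "99" for 1999, but "100" for 2000
    buffer[offset:offset + len(text)] = text.encode("ascii")

meta = bytearray(b"99X")               # 2-char year field, then adjacent field "X"
write_year_field(meta, 0, 2000)
assert meta == bytearray(b"100")       # the adjacent byte was overwritten
```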

The NORAD Two Line Element format remains in wide use for cataloging satellite orbits. Originally intended for punched cards, it uses a two character year. Years <57 are defined to be in the 21st century. It may be supplanted soon, though. The immediate problem is not years, but the number of trackable objects in orbit.
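The stated windowing rule is trivial to express; a hedged Python sketch:

```python
# Sketch of the TLE two-digit-year rule stated above: epoch years below 57
# belong to the 21st century, the rest to the 20th.

def tle_epoch_year(two_digit: int) -> int:
    return 2000 + two_digit if two_digit < 57 else 1900 + two_digit

assert tle_epoch_year(56) == 2056
assert tle_epoch_year(57) == 1957
assert tle_epoch_year(99) == 1999
assert tle_epoch_year(0) == 2000
```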

Even systems that stored the year as a byte often just printed “19” in front of the value. So the year 2000 (stored as 100) would be printed as 19100 or 1910 or 1900 depending on exactly how the number was converted from a byte to printable characters.

Of course a lot of systems didn’t even bother printing the 19; they just printed 100/01/01 or 00/01/01. Similarly, date-entry systems often accepted only two digits, making it impossible to enter years beyond 1999.

You may not remember the enormous market share that IBM occupied in the business computing space in the 20th century. I don’t mean just the mainframe market where S/360, S/370, S/390 & their successors play, but also the much larger (in terms of units) midrange/small business market occupied by the S/3, S/32, S/34, S/36, S/38, AS/400, and their successors. IBM’s midrange systems were ubiquitous in smaller businesses and in branch offices of mega-corporations around the world, and at their peak they brought in \$14 billion annually to IBM.

To be successful in this target market, these systems had to be far more affordable than a mainframe. And since “inexpensive” disk storage cost around \$10,000 per GB in 1990, disk capacities were small by modern standards. You might only have 13 MB — yes, MEGABYTES — of total disk capacity to work with online on a S/32. That’s to serve a corporate operation with revenues of \$250 million and up (that’s IBM’s definition of “small business”) and running accounting, billing, payroll, inventory control, manufacturing production control, and similar applications, which are all about DOLLARS and DATES.

And these were all EBCDIC machines.

So yeah, absolutely, EVERYONE used 2-digit years on these systems prior to Y2K.

Fortunately, disks got cheaper in the late 1990s, and thousands of contract programmers put out Y2K consulting shingles at about the same time. In the 3 years from 1997 to 1999, most of the mess got cleaned up. I’m sure that here and there, a dental office had billing with dates in the wrong century, or a mortgage amortization schedule showed negative dollars owed after the turn of the millennium, but the truly bad-news applications that could hurt customers on a massive scale got fixed before zero hour on December 31, 1999, 23:59:59 UTC.

Many fixed-record text formats (with or without COBOL copybook definitions) used two-digit years, and some still do to this day. So this affects not only storage formats (as in databases) but also interchange formats. Those formats deliberately did not use binary representations, but they did try to be compact.

This went on well into the ’90s, when everybody already knew about the upcoming millennium change.

Funny side note: many systems used the year 2020 as the cutoff, which resulted in an increased problem rate in that year, all over again.

Even in standardized EDI formats like X.12 or EDIFACT, 4-digit years are optional. The Date or Time Period Format Code 2379, for example, starts with code “`2`” being `DDMMYY`. (And even back then it was a really bad idea to allow more than one format for a given instant type.) One of the major drivers for switching the EDIFACT syntax version from 3 to 4 in 1998 was the 8-digit date stamps in the UNB service segment.

Another example would be the VDA fixed-record messages used in the (German) automotive sector; specifications like VDA 4912 from 1991 give the delivery date in 6 digits (or even shorter, with week numbers). They have been in use well past 2000.


## Server Bug Fix: Can anyone help poor Fblthp?

My name is Fblthp, and I am completely lost.

I don’t know the location of my home planet, and given the barriers in language and culture, I can’t tell you its name in any meaningful manner either.

The one thing I can tell you is that in our base-10 mathematics (we count both fingers on each of our 5 hands), the gravitational acceleration at the planet’s surface was just under 10 length units per time unit squared.

Can you help me find my home planet, or at least narrow down my choices?

Initially, it seems that the face-value interpretation of the question can have no answer. The tag, though, poses a definite problem, which enables us to think about the problem slightly more in depth. My first thought was as follows:

Historically, the metre was defined as one ten-millionth of a quarter of the circumference of the earth (although our calculations were slightly off). Similarly, the second was historically defined as one $$86,400$$th of a solar day, which is slightly longer than the sidereal day (the period of the rotation of the earth – they differ by one day every year, because after the earth has completed a revolution it is back in the same position but has completed an ‘extra’ rotation).

Thus if (which is a big if, given the barriers in language and culture) Fblthp’s countrypeople have arrived at their definitions of length and time units in similar ways to us, it implies that their length and time units are in the same relation to the circumference and period of rotation of their planet as our own. This does not, however, narrow down the possibilities very much, as the length of the day of a planet is highly variable. Specifically, if this were the case, it would mean that if we let $$G$$ be the gravitational constant, $$M$$ the mass of the planet, $$r_p$$ its radius, $$g$$ the gravity at its surface, $$C$$ the circumference of the earth, $$C_p$$ the circumference of their planet, $$D$$ the length of our day, $$D_p$$ the length of their day, $$m_p$$ their unit of length, and $$s_p$$ their unit of time, we have:

$$g=\frac{GM}{r_p^2}=\frac{10\,m_p}{s_p^2}=\frac{10\left(\frac{C_p}{C}\right)}{\left(\frac{D_p}{D}\right)^2}\frac{m}{s^2}$$
$$\therefore D_p=D\sqrt{\frac{20\pi r_p^3}{GMC}}$$

This gives an expression for the length of their day in terms of the radius of their planet and its mass. Unfortunately, there’s very little physical connection (as far as I’m aware) between the period of rotation of a planet and its radius and mass, so this enables us to deduce very little.

However, an even bleaker prospect opens up for fblthp if we step back even further historically:

The original historical connection between the definitions of metre and second was that the metre was defined as roughly the length of a pendulum that swings from one side to the other in one second. This doesn’t add any new information, though, as it’s due to the fact that $$g\approx\pi^2$$, since the period of a pendulum of length $$L$$ is roughly:
$$T\approx2\pi\sqrt{\frac{L}{g}}$$
Put another way, no matter what planet you are on, and no matter how you have defined your time unit, if you define your length unit (like we roughly did) as the length of a pendulum with a period of $$2$$ time units, then the surface gravity $$g$$ on that planet will be just under $$10$$ (i.e. $$\approx\pi^2$$) length units per time unit squared. Because of this, if Fblthp’s society has arrived at their length and time units similarly to how we did, then knowing the surface gravity of the planet in their units conveys precisely zero information about the planet itself.
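Working that out explicitly: defining the length unit by a pendulum whose full period is $$2$$ time units fixes the numerical value of surface gravity on any planet, since

$$T=2\pi\sqrt{\frac{L}{g}}\implies g=\frac{4\pi^2L}{T^2},$$

and substituting $$L=1\,m_p$$ and $$T=2\,s_p$$ gives

$$g=\frac{4\pi^2(1\,m_p)}{(2\,s_p)^2}=\pi^2\,\frac{m_p}{s_p^2}\approx 9.87\,\frac{m_p}{s_p^2}.$$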

The only problem with this reasoning of course, is that there is no a priori reason that fblthp’s society should define their units the same way as us. However, the OP has indicated that this is the intended solution to the puzzle.

As a more general reflection on this type of problem:

Thinking about how different societies could communicate about physical constants is interesting. Of course, the only such constants whose numerical values could be communicated straightforwardly are the dimensionless ones, such as the fine-structure constant. All dimensional quantities have to be described in proportion to some other agreed quantity, which of course becomes a chicken-and-egg problem. Without doing something like telling us how many Planck lengths fit in the radius of their earth, we cannot learn anything specific about their units.

This would make the situation no longer a puzzle, though. For what it’s worth, the trick of recognizing the historical connection between metre and second was a good one, so, gripes aside, I say ‘good puzzle!’.

This seems too obvious so maybe I’m missing some steganographic hint or something, but

it seems like the answer is “No, of course not”, because the length and time units could be anything at all so the actual gravitational acceleration could be anything at all.

Perhaps the fact that

what Fblthp says happens to apply to us here on earth with the SI system

is meant to be relevant somehow, but

I don’t see how (again, unless there’s some secret hint here indicating that Fblthp is actually an amnesiac earthling or something of the kind).


## Server Bug Fix: Did the 2019 discovery of O(N log(N)) multiplication have a practical outcome?

Some time ago I read this news article, Mathematicians Discover the Perfect Way to Multiply, reporting a discovery published in 2019, in which Harvey and van der Hoeven[1] found an algorithm able to execute multiplication in $$N\log N$$ steps. Compare that with the $$N^2$$ steps we are used to when doing multiplication by hand.

That amused me, because I had no idea there were still open problems in basic arithmetic in mathematics – something I had taken for granted as long-settled knowledge since childhood.

Now I wonder: did this discovery help, or could it help, materials modeling? Has any code developed for this purpose made use of it? A downside of the new algorithm is a set-up phase in which you need to put the numbers into a suitable form, so the initial effort pays off only for large numbers. My impression is that in matter modeling our algorithms are more about multiplying lots of small numbers fast rather than a few big ones, so I guess the answer is probably no. But I’m not sure.

If not, can someone explain in detail the impact of any of the multiplication algorithms scaling better than $$N^2$$, for some practical application?

[1] David Harvey, Joris van der Hoeven. Integer multiplication in time O(n log n). 2019. ⟨hal-02070778⟩

# What are the state-of-the-art algorithms for long-integer multiplication?

First let me address the point you raised about the schoolbook algorithm having $$\mathcal{O}(n^2)$$ scaling, by saying that this was not the state-of-the-art algorithm used in most matter modeling software. Below I give a brief overview:

(1960) Karatsuba multiplication. $$\mathcal{O}(n^{1.58})$$: Faster than naive multiplication after $$n$$ gets ~$$10^{96}$$.
(1963-2005) Toom-Cook-Knuth. $$\mathcal{O}(n\cdot 2^{\sqrt{2\log n}}\log n)$$: Generalization of Karatsuba.
(1971) Schönhage-Strassen. $$\mathcal{O}(n\log n\log\log n)$$: Outperforms TCK after ~$$10^{10000}$$.
(2007) Fürer. $$\mathcal{O}(n\log n\cdot 2^{\mathcal{O}(\log^*n)})$$: Outperforms SS after ~$$10^{10^{18}}$$.
(2015) Harvey et al. $$\mathcal{O}(n\log n\cdot 2^{3\log^*n})$$: Similar to Fürer’s algorithm.
(2015) Harvey et al. $$\mathcal{O}(n\log n\cdot 2^{2\log^*n})$$: Relies on conjectures not yet proven.
(2016) Covanov-Thomé. $$\mathcal{O}(n\log n\cdot 2^{2\log^*n})$$: Relies on (different) conjectures not yet proven.
(2018) Harvey & van der Hoeven. $$\mathcal{O}(n\log n\cdot 2^{2\log^*n})$$: Finally proven without conjectures.
(2019) Harvey & van der Hoeven. $$\mathcal{O}(n\log n)$$: The algorithm mentioned in the paper you cited.

# Which of these algorithms have practical consequences?

Schönhage-Strassen: the GNU multi-precision library uses it for numbers with 33,000 to 150,000 digits.
Toom-Cook: used for intermediate-sized numbers, basically up to where Schönhage-Strassen takes over.
Karatsuba: a specific case of Toom-Cook; not likely used for numbers smaller than $$10^{96}$$.
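To make the divide-and-conquer idea concrete, here is a minimal Python sketch of Karatsuba's three-multiplication split (illustrative only; real libraries fall back to schoolbook multiplication below a tuned cutoff):

```python
# Minimal sketch of Karatsuba multiplication: replaces four half-size
# products with three, giving the O(n^1.58) scaling listed above.

def karatsuba(x: int, y: int) -> int:
    if x < 10 or y < 10:                  # base case: single digits
        return x * y
    m = max(len(str(x)), len(str(y))) // 2
    high_x, low_x = divmod(x, 10 ** m)
    high_y, low_y = divmod(y, 10 ** m)
    z0 = karatsuba(low_x, low_y)
    z2 = karatsuba(high_x, high_y)
    z1 = karatsuba(low_x + high_x, low_y + high_y) - z0 - z2
    return z2 * 10 ** (2 * m) + z1 * 10 ** m + z0

assert karatsuba(1234, 5678) == 1234 * 5678
assert karatsuba(31415926, 27182818) == 31415926 * 27182818
```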

# So what are the consequences of the 2019 algorithm?

Likely nothing for the calculations we typically do. Schönhage and Strassen predicted very long ago that $$\mathcal{O}(n\log n)$$ would be the most efficient possible algorithm from a computational complexity point of view, and in 2019 the algorithm that achieves this predicted “lower bound” was found by Harvey and van der Hoeven. It is probably not implemented in any library, just as the 2018, 2016, 2015, and 2007 algorithms are also not implemented anywhere as far as I know. They are all beautiful mathematics papers which give theoretical scalings, but likely have no practical consequences.

Do you ever multiply together integers with 96 digits? Typically in double-precision floating point arithmetic we multiply numbers with no more than 18 digits, and in quadruple-precision arithmetic (which is indeed used in matter modeling for things like numerical derivatives in variational energy calculations, but quite rarely) numbers have up to 36 digits, but it is unlikely that anyone in matter modeling is frequently multiplying numbers with 96 digits, so even the Karatsuba algorithm is in practice worse than the schoolbook $$n^2$$ algorithm, due to Karatsuba involving extra shifts and additions as overhead. The Toom-Cook algorithms (such as Karatsuba) are useful in number theory, and in fact we use them every day when we do e-banking or when we use GitHub involving RSA keys, since RSA numbers are hundreds or thousands of digits long. The Schönhage-Strassen algorithm is used mainly in number theory for things like calculating record numbers of digits of $$\pi$$, and for practical applications such as multiplying polynomials with huge coefficients.

Conclusion: The 2019 algorithm for integer multiplication does not affect real-world applications.

This $$O(n\ln n)$$ integer multiplication algorithm is a galactic algorithm, meaning that it won’t be used despite being “of lower complexity”, because it only becomes more efficient than existing algorithms for problems vastly larger than any relevant to us in practice. The problem is that big-$$O$$ notation only tells us how the algorithm behaves for sufficiently large $$n$$, whereas values of $$n$$ that come up in practice will see much worse behaviour. Section 5 of their paper explains:

In this section we present the main integer multiplication algorithm. We actually give a family of algorithms, parameterised by a dimension parameter $$d\geqslant 2$$. Let $$n_0 := 2^{d^{12}}\geqslant 2^{4096}$$, and suppose that we wish to multiply integers with $$n$$ bits. For $$n < n_0$$, we may use any convenient base-case multiplication algorithm, such as the classical $$O(n^2)$$ algorithm. For $$n\geqslant n_0$$ we will describe a recursive algorithm that reduces the problem to a collection of multiplication problems of size roughly $$n^{1/d}$$. We will show that this algorithm achieves $$M(n) = O(n\log n)$$, provided that $$d\geqslant 1729$$.

In other words, it’s only worth using the new algorithm to multiply numbers with at least $$2^{1729^{12}}$$ bits. (For integer multiplication, the problem size $$n$$ is how many bits the larger integer has, not the integer itself; but even this number would have to be so large for the algorithm to be worthwhile that I’ll find it useful to discuss its number of digits, in base $$10$$.) This number of bits has more than $$2\times 10^{38}$$ digits in base $$10$$. A computer using every subatomic particle in the observable universe to store one bit of data could only store a number of bits whose digit count is well under $$100$$. So there’s no chance anyone will ever have a machine capable of such multiplication regardless of algorithm. The paper notes that smaller problems should just be done with existing algorithms.

Why does $$1729$$ come up here? Because a $$1729$$-dimensional Fourier transform is used. I’m sure within a few years there will be a tweaked version that brings that number down, allowing smaller problems to be multiplied in $$O(n\log n)$$ time. But even if we only require $$d=2$$, so $$n_0=2^{2^{12}}$$, that’s still a number with $$1234$$ digits in base $$10$$, far more than the aforementioned $$100$$. For what it’s worth, the paper sketches a route to using $$d=8$$, in which case $$n_0$$ would have over $$2\times 10^{10}$$ digits.
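The magnitudes quoted above are easy to check numerically; this quick Python sketch reproduces both digit counts:

```python
import math

# Decimal digit count of the threshold bit-count n0 = 2^(1729^12):
bits = 1729 ** 12                        # exact integer arithmetic
digits = bits * math.log10(2)            # log10 of 2^bits
print(f"{digits:.2e}")                   # ≈ 2.1e38: "more than 2 x 10^38 digits"

# The hypothetical d = 2 case, n0 = 2^(2^12) = 2^4096:
d2_digits = math.floor(4096 * math.log10(2)) + 1
print(d2_digits)                         # 1234 decimal digits
```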

As my link to Wikipedia notes, other kinds of multiplication have also encountered galactic algorithms, such as gradual improvements to the Coppersmith–Winograd algorithm for matrix multiplication.

To take a slight detour, we can also look at the progress of matrix multiplication algorithms. As mentioned in a few comments here, standard matrix multiplication is $$O(n^{3})$$, and any exact method for a general matrix is going to require $$O(n^{2})$$ operations just to process all the elements of the initial matrices. Over the last 50 years, different methods have been developed to reduce the exponent, often denoted $$\omega$$. These could in principle be very useful for matter modeling, as a number of electronic structure and molecular dynamics methods rely on matrix multiplication and matrix operations which have been shown to scale the same (determinant, inversion, Gaussian elimination) or in a way expressible in terms of $$\omega$$ (eigenvalues).

The simplest such approach, and thus the most likely to be applied in practice, is the 1971 Strassen algorithm, which has $$O(n^{\log_2 7})=O(n^{2.807…})$$ scaling. It achieves this by recursively breaking the initial matrices into 4 blocks and storing intermediate quantities such that you can perform 7, rather than the usual 8, block multiplications.
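The seven Strassen products for a single 2×2 split can be written out directly. This toy sketch uses scalar entries; in the full algorithm the entries are themselves matrix blocks and the same identities are applied recursively:

```python
def strassen_2x2(A, B):
    """Multiply 2x2 matrices with 7 multiplications instead of 8."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],   # 18 additions/subtractions in total:
            [m2 + m4, m1 - m2 + m3 + m6]]   # the overhead that sets the crossover
```

The extra additions and subtractions are the overhead that, together with memory-access patterns, determines the crossover point discussed below.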

Fairly recent studies suggest that the crossover point where it becomes more efficient than standard matrix multiplication is somewhere between $$n=512$$ and $$n=1024$$ (the method works best with widths that are powers of two, due to the repeated splits into 4 blocks), which are not unreasonable sizes to encounter in a large molecular electronic structure calculation. In practice, the better scaling in general is traded for greater speed in specific cases by setting a threshold size below which the recursion is stopped and replaced with standard matrix multiplication. I don’t know offhand of any program that actually uses this method, but it seems like it would be a simple addition and could produce tangible speedups for larger systems.

The last significant improvement was the 1990 Coppersmith–Winograd algorithm, which scales as $$O(n^{2.376…})$$. The algorithm is much more complicated than the original Strassen algorithm; the proof of the scaling relates the rank of a particular trilinear form of tensor products to $$\omega$$. This complexity manifests in a very large prefactor, making the method much slower than the Strassen method or standard matrix multiplication. The impractically large matrices needed to reach the crossover threshold for these later approaches have led them to be referred to as galactic algorithms.

These later approaches currently have no use in matter modeling (or really any practical application), but could have significance in the long run. While the current thread of research has focused on proving a lower bound for $$omega$$, this work could provide the impetus for producing more practical algorithms by proving that better scaling than the standard algorithm is possible.

Can someone explain in detail the impact of any of the multiplication algorithms scaling better than $$N^2$$, for some practical application?

An actual application is right in front of our eyes: digital signature using RSA. If I click on the lock icon for the present page in my browser, then on the arrow on the right of Connection secure, then More Information, then View Certificate, I see that the connection uses this RSA-2048 public key:

This means that at each new connection, the browser performs modular arithmetic with 2048-bit integers, that is, integers with over 600 decimal digits.

In order to authenticate the server (or, in a previous operation, to check its certificate, which must be done at least once on the first connection), it computes A^65537 mod M for the 2048-bit M in the picture, and A of the same size. Since 65537 = 2^16 + 1, that requires 17 modular multiplications. Each can be (and often is) performed by multiplying two 2048-bit integers into a 4096-bit integer, followed by modular reduction by way of another multiplication of 2048-bit integers.

That arithmetic is performed using limbs (the equivalent of decimal digits) that typically are 32-bit (sometimes 64-bit, or 16-bit on low-end mobile devices). Multiplication is thus performed on integers of width N = 64 limbs. With the schoolbook algorithm, each multiplication requires N^2 multiplications of two limbs and additions of the result, each requiring on the order of 50 CPU clock cycles. At 1 GHz, we are talking 17×2×64×64×50×10^-9 s, that is, ≈7 ms, which is not negligible because establishing an https connection (or checking a certificate) is so common.
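Both numbers in that estimate are easy to reproduce. A small Python sketch, counting the modular multiplications of left-to-right square-and-multiply (which is what makes 65537 such a cheap exponent), and redoing the cycle arithmetic:

```python
def modexp_mult_count(e):
    """Modular multiplications used by left-to-right square-and-multiply."""
    count = 0
    for bit in bin(e)[3:]:    # skip '0b' and the leading 1 bit
        count += 1            # one squaring per remaining bit
        if bit == '1':
            count += 1        # plus a multiply when the bit is set
    return count

print(modexp_mult_count(65537))       # 16 squarings + 1 multiply = 17

# Schoolbook cost estimate from the text, at 1 GHz:
seconds = 17 * 2 * 64 * 64 * 50 * 1e-9
print(f"{seconds * 1e3:.1f} ms")      # ≈7 ms
```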

In order to reduce delay and power consumption, it pays to use at least the simplest of the below-O(N^2) multiplication algorithms: Karatsuba multiplication, which is O(N^1.585). There is a threshold before that pays off (especially on modern CPUs with fast multipliers), which can be down to about 10 limbs (reference). For 64×64 limbs, Karatsuba would typically reduce computing time by a factor approaching (4/3)^2 ≈ 1.8, which is better than nothing. That’s part of why implementations based on GMP are faster. For low-end devices with 16-bit limbs, or when doing 4096-bit RSA, that’s a factor approaching (4/3)^3 ≈ 2.4, and quite worth using.

On the server side, there are more computations (about 50 times more work), and that can sometimes represent a sizable fraction of the total workload, but the incentive to use Karatsuba for the bulk of the work is actually lower: the numbers manipulated are half as wide, and sometimes the limbs are bigger.

There are other applications of Karatsuba and its generalization Toom-Cook in cryptography, not limited to RSA; e.g. batch verification of ECC signatures, see Daniel J. Bernstein’s Batch binary Edwards. In the specialized subfield of cryptanalysis, there are even uses of Schönhage-Strassen, e.g. cryptanalysis of ISO 9796-2 signatures. It’s in GMP for a reason.

The recent Harvey-Hoeven algorithm is a satisfying achievement, but is not going to be used in practical applications. I even doubt that it can ever be implemented: it seems to apply only to numbers on the order of 1729^12 bits, which is about 10^22 times the RAM in a current supercomputer.

Even the simplest better-than-schoolbook (O(n^2)) algorithms like Karatsuba are only useful in practice for large `n`. But what is `n`? It’s not single bits, and it’s not decimal digits. (Posting this tangent as requested in comments.)

Software implementations of an extended-precision multiply algorithm work in integer chunks as wide as the hardware provides. On a 64-bit CPU, that’s usually 64×64 => 128-bit integer multiplication, e.g. the x86-64 `mul` instruction. (@fgrieu’s answer has more detail on this, including the term “limb” for such a chunk.)

That fixed-width CPU instruction runs in fixed time (regardless of the operand values on most CPUs; division is the only instruction slow enough to justify variable latency in a modern pipelined CPU, and in the most recent x86-64 CPUs even division is constant-time). e.g. on modern Intel and AMD CPUs, `mul r64` or `mulx` have a throughput of 1 per cycle and a latency of 3 to 4 cycles (for the low and high halves of the output, respectively: https://www.uops.info/html-instr/MUL_R64.html).

Hardware doesn’t “know” it’s doing one big multiply; it just does each fixed-width part separately. Hardware can easily be parallel (adding partial products) if you can throw enough transistors at the problem. HW multipliers in CPUs use the Dadda tree design. This is simpler than doing 63 additions of shifted versions of the other 64-bit input (or `0` where this input has a 0 bit) using normal adders: carry propagation can be deferred. Hardware tricks like that are AFAIK unrelated to any of the sub-N^2 algorithmic tricks.

Such a multiply instruction, and add-with-carry, are the building blocks for schoolbook multiplication’s O(n^2) time complexity. e.g. 128-bit multiplication (producing a 128-bit result) takes 3 multiplies on x86-64: https://godbolt.org/z/qBAbfQ. To also produce the high half, all of those multiplies would have to be “full” 64×64=>128 instead of only 64×64 => 64 for the low x high and high x low cross products, and we’d need to do the high x high product, for a total of 4 `mul` instructions.

e.g. this SO answer shows 32×32 => 64-bit multiply using 16-bit x86 so each input is 2 limbs, and the output is 2+2 = 4 limbs, requiring 2*2 = 4 multiplies of 16×16 => 32 bits each. Exactly the same pattern would apply for 64×64 => 128 on a 32-bit machine, or 128×128 => 256 on a 64-bit machine.
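That decomposition into limb products is easy to mirror in Python. This is a sketch of the pattern only: hardware provides the 16×16 => 32 pieces, and software does the shifts and adds:

```python
def mul32_via_16bit_limbs(x, y):
    """32x32 => 64-bit product built from four 16x16 => 32-bit multiplies."""
    MASK = 0xFFFF
    xl, xh = x & MASK, x >> 16          # split each input into 2 limbs
    yl, yh = y & MASK, y >> 16
    lo   = xl * yl                      # low  x low
    mid1 = xl * yh                      # low  x high \  cross products,
    mid2 = xh * yl                      # high x low  /  shifted by 16
    hi   = xh * yh                      # high x high, shifted by 32
    return lo + ((mid1 + mid2) << 16) + (hi << 32)
```

Exactly the same structure scales up: 64×64 => 128 from 32-bit limbs, or 128×128 => 256 from 64-bit limbs.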

Since that building block is opaque to software, and/or shuffling individual bits around would be much more expensive than it’s worth, `n` is only 64 for 4096-bit integer multiply.

To allow better instruction-level parallelism (letting superscalar CPUs do the same work in less time) and reducing overhead of `mov` instructions, Intel introduced (in Broadwell) the ADX extension that allows two parallel dependency chains of add-with-carry. This whitepaper shows the advantages it gives for small problems (like 512-bit x 512-bit multiplication (8 x 8 limbs)).

For floating-point, an FP multiplier involves an integer multiplier for the 53×53-bit => 53-bit correctly rounded mantissa (the most significant 53 bits of the full integer product) plus hardware to add the exponents, and check for / handle overflow / underflow and NaN. See Why does Intel’s Haswell chip allow floating point multiplication to be twice as fast as addition? for some info about how FP ALUs are designed, and the barely-related question of why Intel made the design choices they did in Haswell and Skylake.

To get extra FP precision, one technique is so-called “double-double”: a wide mantissa using two `double`s, but only the exponent from one of them. Using it takes only a handful of double-precision math operations, like 6 to 20 depending on which operation and whether FMA (fused multiply-add without intermediate rounding) is available. The relevant width is n=2 doubles, not n=36 decimal digits. (And IEEE FP is a binary format, not decimal, although decimal FP formats do exist, with some CPUs such as PowerPC even having hardware support for them.)
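The error-free building block underneath double-double can be demonstrated with ordinary Python floats, which are IEEE doubles. This is Knuth's two-sum transformation, a sketch of the core idea rather than a full double-double library:

```python
def two_sum(a, b):
    """Error-free transformation: s = fl(a + b), and a + b == s + err exactly."""
    s = a + b
    bb = s - a                       # the part of b that made it into s
    err = (a - (s - bb)) + (b - bb)  # rounding error, recovered exactly
    return s, err

s, err = two_sum(1.0, 1e-17)         # 1e-17 is below half an ulp of 1.0
print(s, err)                        # the "lost" 1e-17 survives in err
```

A double-double library chains transformations like this (plus a product analogue, using FMA where available) to carry a roughly 106-bit effective mantissa.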

Note that a SIMD multiplier just replicates that for each SIMD element. double-double can be vectorized efficiently with SIMD if you store separate vectors of lo / hi halves, so you don’t need to shuffle to line up the corresponding halves of a single number. e.g. this Q&A.

### Other extended-precision number representations

You could store numbers as an array of bytes, each byte holding a single decimal digit. But that’s pretty terrible. Historically, it was not uncommon to use a simplistic format like that, especially for a score counter in a game that gets printed on screen in decimal format constantly. Or BCD (2 decimal digits per 8-bit byte, each in a separate 4-bit nibble).

But this is pretty bad, especially for multiplying numbers stored in this format, because then `n` becomes large and complexity scales with N^2 (for the simple schoolbook algorithm).
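For concreteness, here is what packed BCD looks like in a small Python sketch (hypothetical helper names, not any particular library's format):

```python
def to_bcd(n):
    """Pack a non-negative integer as BCD: two decimal digits per byte,
    one per 4-bit nibble, least significant byte first."""
    out = bytearray()
    while True:
        out.append((n % 10) | ((n // 10 % 10) << 4))
        n //= 100
        if n == 0:
            return bytes(out)

def from_bcd(b):
    """Unpack BCD bytes back into an integer."""
    n = 0
    for byte in reversed(b):
        n = n * 100 + (byte >> 4) * 10 + (byte & 0xF)
    return n

print(to_bcd(1987).hex())   # bytes 0x87, 0x19: the digit pairs of 1987
```

Multiplying numbers kept in this form means `n` is the decimal digit count, which is why schoolbook N^2 cost bites so much harder than with binary limbs.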

@davidbak commented:

w.r.t. “nobody uses decimal digits as an extended-precision format” – is that true? I know there used to be implementations of multi precision integer arithmetic that used the largest power of 10 that would fit in a word as the base – e.g., 10^9 for 32-bit machines. Made conversions to<->from a human-readable base 10 notation much easier and cost only a “reasonable” overhead for some definition of reasonable that might depend on your use case. Is that not done anymore? (Although strictly speaking those aren’t decimal digits, just power-of-ten digits…)

Indeed, larger powers of 10 could be sane when you need frequent conversion to/from a decimal string, or multiply/divide by powers of 10. But then a 36-digit number is 4 chunks of 9, not 36 chunks of 1. e.g. one use-case was printing the first 1000 decimal digits of `Fib(10^9)` (x86-64 asm code-golf) where it’s handy to have right shift by 1 limb be division by a power of 10, and for conversion to decimal to only need to consider the current limb, converting that to 9 decimal digits without having to do extended-precision division where the remainder depends on all higher bits.
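That base-10^9 layout can be sketched as follows (hypothetical helper names); note how printing needs no extended-precision division, only per-limb formatting:

```python
BASE = 10 ** 9    # largest power of 10 that fits a 32-bit limb

def to_limbs(n):
    """Split a non-negative integer into base-10^9 limbs, least significant first."""
    limbs = []
    while True:
        limbs.append(n % BASE)
        n //= BASE
        if n == 0:
            return limbs

def to_decimal(limbs):
    """Each limb yields exactly 9 decimal digits, independently of the others."""
    s = str(limbs[-1])                # most significant limb: no zero padding
    for limb in reversed(limbs[:-1]):
        s += f"{limb:09d}"            # zero-pad every other limb to 9 digits
    return s
```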

See also this code-review answer about an implementation based on single decimal digits. I included some details about what CPython does, and some other links. It’s not rare for beginners to come up with that as an idea, but non-toy libraries use at least 10^9 as the base for “limbs”, unless we’re talking about BCD.

Or more commonly binary extended precision using all 32 bits per 32-bit integer, or sometimes only 2^30 to leave room for high-level language handling of carry in/out (like in CPython) without access to an asm carry flag.

Another advantage of leaving some spare bits per limb is to allow deferred carry normalization, making SIMD for addition efficiently possible. See @Mysticial’s answer on Can long integer routines benefit from SSE?. Especially for extended-precision addition, leaving some slack in each limb is actually interesting if you design around that format with awareness of when to normalize as an extra step. (@Mysticial is the author of y-cruncher and also works on Prime95; he implemented its use of FP-FMA to take advantage of the FP mantissa multipliers for bit-exact integer work.)
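A sketch of that deferred-carry idea with base-2^30 limbs (here in Python, where the words are unbounded ints standing in for 32-bit words): the two spare bits absorb carries from a few limb-wise additions, which could be done element-wise with SIMD, before a single serial normalization pass resolves them.

```python
BITS = 30                        # 30-bit limbs leave 2 spare bits per 32-bit word
MASK = (1 << BITS) - 1

def add_deferred(a, b):
    """Element-wise limb add with NO carry propagation (SIMD-friendly)."""
    return [x + y for x, y in zip(a, b)]

def normalize(limbs):
    """Propagate the accumulated carries in one serial pass."""
    out, carry = [], 0
    for x in limbs:
        x += carry
        out.append(x & MASK)
        carry = x >> BITS
    if carry:
        out.append(carry)
    return out

def to_int(limbs):
    """Reassemble the represented value (for checking)."""
    return sum(x << (BITS * i) for i, x in enumerate(limbs))
```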

That answer also points out that “really large bignum” multiplications can be done as an FFT.

Normally (with standard techniques) it’s very hard to take advantage of SIMD for extended-precision; within one operation, there’s a serial dependency between each element: you don’t know if there’s carry-in to this element until you process the previous element (for addition).

For multiplication, it’s usually even worse: SIMD doesn’t usually have very wide multipliers, and with the result being twice as wide as the inputs, there’s a problem of where to put them.

The amount of work done by one building block should be measured as the “product bits” you compute per cycle, e.g. 64×64 => 128-bit full multiply does 64×64 = 4096 units of work. But a 4x 32×32=>64-bit SIMD multiply (like AVX2 `vpmuludq`) does `32^2` = 1024 units of work per element, with 4 elements, for a total of 4096 units of multiply work. And it leaves more adding of partial products not done. So even in theory, ignoring other factors, AVX2 `vpmuludq` on a 256-bit vector is break-even with scalar.

AVX512 has 64×64 => 64-bit multiply (but still no way to get the upper-half of the full result so it’s no more helpful for BigInteger than 32×32 => 64, I think). AVX512IFMA more directly exposes what the FP mantissa multipliers can do, providing separate low and high half 52×52 => 104-bit multiply.

(Other SIMD integer multiply instructions like `vpmulld` that do 32×32 => 32-bit usually decode to two separate uops for the vector-ALU ports, so they can use the same per-element multipliers as FP mantissas. But those multipliers are only 52×52 or 24×24-bit. Making them wider would cost significantly more for these wide SIMD ALUs, and only help the fairly rarely used SIMD-integer multiply instructions.)

## Practical importance: compactifying explanations

It is widely believed that $$\mathcal{O}(n\log n)$$ is the best possible result, and therefore we no longer have to say $$\mathcal{O}(n\log n\cdot 2^{2\log^*n})$$ every single time in every single paper in related fields; we can just say $$\mathcal{O}(n\log n)$$ every time now. Here is a related quote from Reddit:

“The result is of extreme practical importance. Not for actually multiplying integers (as usual with these algorithms, it’s probably not faster than existing algorithms for integers that can be stored in the observable universe), but for writing papers. It has always been a hassle to write down the complexity of integer multiplication or algorithms based on integer multiplication by introducing soft-O notation, little-o exponents, epsilons greater than 0, or iterated logarithms. From now on I can just write O(n log n) in my papers and be done with it!”

While this might not be the answer you’re looking for, about practical impact on computations, it does in fact answer the question of “What is the practical value of this algorithm?”

## Making Game: How do I clear the cache / history in Windows 7?

In Windows XP there was an option in the taskbar properties to clear all recently accessed webpages, recent file lists, i.e. cache, temp files and so on; is there an equivalent for Windows 7?

1. Right-click on the taskbar or the Windows Orb.
2. Open “Properties”.
3. On the “Start Menu” tab, click on “Customize…”.
4. In the “Customize Start Menu” window, check the checkbox “Recent Items”.
5. Click OK twice to close the 2 open windows.
6. In the Start Menu, you have now a “Recent Items” entry.
7. If you right-click it, you have a menu option “Clear recent items list”.

Tagged : / /


## Server Bug Fix: Did Neil Armstrong write such a letter to a skeptic?

Today, I saw a tweet with an image that claims to contain the answer of Neil Armstrong to a teacher that was skeptical of the moon landing (see below).

Of course it’s not the image of a real letter, that’s not the point. It’s about the content.

What made me suspicious is the phrase “Or you could get on the net and find …”. There’s no date; the letter could be from a time when the Internet did exist, but the whole “feeling” of the letter makes me think it was written a few years after Apollo 11 at the latest. So either the phrase is a saying I, as a non-native speaker, am not aware of, or it does refer to the Internet, which would make this even more suspicious to me.

I tried finding the letter online elsewhere but only ended up finding the same tweet again.

So, did Neil Armstrong write such a letter to a teacher?

The Twitter thread gives an explicit source for this letter: A Reluctant Icon: Letters to Neil Armstrong by James R. Hansen. This book exists, published by Purdue University Press, ISBN 9781557539694.

The publisher’s blurb says:

Artfully curated by James R. Hansen, A Reluctant Icon: Letters to Neil Armstrong is a companion volume to Dear Neil Armstrong: Letters to the First Man from All Mankind, collecting hundreds more letters Armstrong received after first stepping on the moon until his death in 2012.

I also found an extract online here which includes the notice:

The majority of the letters featured in this volume are from the Neil A. Armstrong papers in the Barron Hilton Flight and Space Exploration Archives, Purdue University Archives and Special Collections.

This is expanded on in the preface, which discusses the editor’s previous work writing an authorised biography of Armstrong, and how Armstrong’s papers were subsequently donated to Purdue University (where Armstrong studied) and form the basis of these two volumes.

This is corroborated by a quick online search, e.g. a letter to the New York Times and Fox News.

The table of contents lists an entire chapter on “Quacks, conspiracy theorists, and UFOlogists” spanning pages 55 to 106.

@hdhondt’s comment on the question made me aware that I totally missed that the image is embedded in a thread, and that it’s about the book “A Reluctant Icon: Letters to Neil Armstrong”. It seems the teacher’s letter is from 2000, so Neil does indeed mean the Internet. That makes the whole story quite plausible.

What made me suspicious is the phrase “Or you could get on the net and find …”. There’s no date, the letter could be from a time when the Internet did exist but the whole “feeling” of the letter makes me think it was written a few years after Apollo 11, at the latest

The Tweet has not provided you the letter to which Armstrong was responding. That letter is not only dated, but makes reference to what the writer believed Armstrong’s age to be at the time (or what it likely was by the time he received the letter), and notes that the landing took place thirty years earlier. It also encourages Armstrong to visit websites with material debunking the landing (that is, if he’s not too old/ignorant to know about the Internet).

Tagged : /

## Server Bug Fix: Mathematical expression of SCAN (Strongly Constrained and Appropriately Normed) constraints in DFT

(This question is originally posted on physics stackexchange, but someone suggested me to post on this site, so there you go)

I’m compiling the mathematical expressions of the SCAN (Strongly Constrained and Appropriately Normed) functional’s constraints, but apparently they are not very obvious from the paper (at least to me). I have compiled some constraints from the SCAN paper, the PBE paper, and Perdew’s presentation, but some are missing (see the last line of this question).

General form

\begin{align} E_{xc}[n] &= \int n\, \varepsilon_x^{unif}(n)\, F_{xc}(s,\alpha)\, \mathrm{d}\mathbf{r} \\ E_x[n] &= \int n\, \varepsilon_x^{unif}(n)\, F_x(s,\alpha)\, \mathrm{d}\mathbf{r} \\ E_c[n] &= \int n\, \varepsilon_x^{unif}(n)\, F_c(r_s,t,\zeta,\alpha)\, \mathrm{d}\mathbf{r} = \int n\left[\varepsilon_c^{unif} + H(r_s,t,\zeta,\alpha)\right] \mathrm{d}\mathbf{r} \end{align}
where $$\varepsilon_x^{unif}(n) = -(3/4\pi)(3\pi^2n)^{1/3}$$ and $$\varepsilon_c^{unif}$$ are obtained from Perdew & Wang, 1992, and the variables $$s,\alpha,r_s,t,\zeta$$ are listed in the SCAN paper’s supplementary material.

Exchange constraints

1. Negativity
$$F_x(s,\alpha) > 0$$
2. Spin-scaling
$$E_x[n_{\uparrow}, n_{\downarrow}] = \frac{1}{2}\left(E_x[2n_{\uparrow}] + E_x[2n_{\downarrow}]\right)$$
3. Uniform density scaling
$$E_x[n_\gamma] = \gamma E_x[n] \text{, where } n_\gamma(\mathbf{r}) = \gamma^3 n(\gamma\mathbf{r})$$
4. Fourth order gradient expansion (the expression is from Perdew’s presentation)
$$\lim_{s\rightarrow 0,\, \alpha\rightarrow 1} F_x(s,\alpha) = 1 + \frac{10}{81}s^2 - \frac{1606}{18225} s^4 + \frac{511}{13500} s^2(1-\alpha) + \frac{5913}{405000}(1-\alpha)^2$$
5. Non-uniform density scaling
$$\lim_{s\rightarrow\infty}F_x(s,\alpha) \propto s^{-1/2}$$
6. Tight lower bound for two electron densities
$$F_x(s,\alpha=0) \leq 1.174$$

Correlation constraints

1. Non-positivity
$$F_c(r_s,t,\zeta,\alpha) \geq 0$$
2. Second-order gradient expansion
\begin{align} \lim_{t\rightarrow 0}H(r_s, \zeta, t, \alpha) &= \beta \phi^3 t^2 \\ \beta &\approx 0.066725 \end{align}
3. Rapidly varying limit (using the term from PBE’s paper instead of from SCAN’s paper; is it “Uniform density scaling to the low density limit”?)
$$\lim_{t\rightarrow\infty}H(r_s, \zeta, t, \alpha) = -\varepsilon_c^{unif}$$
4. Uniform density scaling to the high density limit
\begin{align} \lim_{r_s\rightarrow 0}H(r_s, \zeta, t, \alpha) &= \gamma \phi^3 \ln\left(t^2\right) \\ \gamma &= \frac{1}{\pi}(1 - \ln 2) \end{align}
5. Zero correlation energy for one electron densities
$$H(r_s, \zeta=1, t, \alpha=0) = -\varepsilon_c^{unif}$$
6. Finite non-uniform scaling limit (I don’t know this)
6. Finite non-uniform scaling limit (I don’t know this)

Exchange and correlation constraints

1. Size extensivity (I don’t know this)

2. General Lieb-Oxford bound
$$F_{xc}(r_s, \zeta, t, \alpha) \leq 2.215$$

3. Weak dependence upon relative spin polarization in the low-density limit (I don’t know this)

4. Static linear response of the uniform electron gas (I don’t know this)

5. Lieb-Oxford bound for two-electron densities
$$F_{xc}(r_s, \zeta=0, t, \alpha=0) \leq 1.67$$

Summary: What are the constraints for 12, 13, 15, 16? If you want, you can give one constraint in one answer.

## Constraint #13: Size-Extensivity

While the Wikipedia page for size-consistency and size-extensivity gives a clear formula for the definition of size-consistency, unfortunately it does not give a definition of size-extensivity, so I had to look deeper into the reference it provided. It says size-extensivity was introduced by Bartlett, citing this review paper of his from 1981, but this paper itself credits the following papers, which I have now looked at for the first time and summarized below:

• (1955) Keith Brueckner first recognized, in his study of the uniform electron gas, that some terms in the energy obtained by Rayleigh-Schrödinger perturbation theory incorrectly do not scale linearly with the number of electrons $$N$$ as $$N\rightarrow\infty$$. He found a way to cancel out all of these spurious terms, up to fourth order in the perturbation theory. These spurious terms are also the reason why CI (configuration interaction) converges slowly with respect to the number of excitations included. A year later, Brueckner published the paper with Gell-Mann that became the subject of my other answer here. He was one of the all-time greats, and lived to 90.
• (1957) Jeffrey Goldstone proved the “linked-diagram theorem”, which ensures that the spurious terms found by Brueckner get cancelled out to all orders in perturbation theory. Goldstone, by the way, is one of the most influential physicists still alive! He’s currently 85, and he was even a co-author on the quite recent paper that popularized Adiabatic Quantum Computing 🙂
• (1965) Bartlett’s review paper says that Hans Primas was actually the one to first really emphasize this concept of having proper scaling. I don’t know much about Primas, though I found that he survived to the age of 86 🙂
• (1971-1973) Wilfried Meyer used this concept of size-extensivity to justify the CEPA model. Meyer had just finished creating the MOLPRO software in the 1960s, a package which, now more than 50 years later, is perhaps the most popular quantum chemistry software for fast high-accuracy calculations.
• (1978) Bartlett and Purvis used the term “size-extensivity” here, so this is perhaps where the term was first introduced, but he uses it to describe what the 1955 and 1957 papers achieved.

So what is size-extensivity?

My reading of the above Bartlett papers tells me that for a homogeneous system like an electron gas or a set of non-interacting He atoms, the energy should scale linearly with the number of particles, and that the concept can also be generalized to properties other than energy.

## Making Game: Disable Windows search history

Question: how can I disable Windows search history? (Screenshot)

What I did: disabled “My device history” and cleared my device history from the Windows Permissions & History settings. All data-collection settings in the Windows privacy settings are disabled. Yet, it still remembers my search history.

Update: it seems that there’s no option to disable it. To add more clarification to the solution, here are the steps for scheduling a cleaning task: `Task scheduler > Create task`. Name it, select “Run with highest privileges” and choose the trigger (e.g. on startup). Now, under Actions, add an action and fill in the following:

1. Program/script: `powershell.exe`
2. Add arguments: `Remove-Item -Path 'HKCU:\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\WordWheelQuery\*'`

Note: for me, it didn’t work without removing the asterisk at the end of the command.

Yet, it still remembers my search history.

File Explorer search history cannot be disabled but it can be cleared. After performing a search you can easily clear your search history through the UI.

If you are attempting to automate this functionality you will have to write a script to delete the following registry key.

``````HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\WordWheelQuery
``````

I would absolutely test this script before you implement it, but it could be something as simple as the following PowerShell command.

``````Remove-Item -Path "HKCU:\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\WordWheelQuery\*"
``````

The simplest solution would be to create a Scheduled Task that runs a PowerShell script when the user logs out of the computer; this would clear the recent search history.

Tagged : / / /

## Linux HowTo: Disable Windows search history

Question: how can I disable Windows search history? (Screenshot)

What I did: disabled “My device history” and “cleared my device history” from Windows Permission & History settings. All data-collection settings from Windows privacy settings are disabled. Yet, it still remembers my search history.

Update: it seems that there’s no option to disable it. Just to add more clarification to the solution, here’re the steps for scheduling a cleaning task: `Task scheduler > Create task`. Name it, select run with highest privileges and choose the trigger (e. g: on startup). Now, in the actions, add an action and fill in the following:

1. Program/script: `powershell.exe`
2. Add arguements: `Remove-Item -Path 'HKCU:SOFTWAREMicrosoftWindowsCurrentVersionExplorerWordWheelQuery*'`

Note: for me, it didn’t work without removing the asterisk at the end of the command.

Yet, it still remembers my search history.

File Explorer search history cannot be disabled but it can be cleared. After performing a search you can easily clear your search history through the UI.

If you are attempting to automate this functionality you will have to write a script to delete the following registry key.

```
HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\WordWheelQuery
```

I would absolutely test this before implementing it, but something as simple as the following PowerShell command should do the job.

```
Remove-Item -Path "HKCU:\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\WordWheelQuery\*"
```

The simplest solution would be to create a Scheduled Task that runs a PowerShell script when the user logs out of the computer; this would clear the recent search history.
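As a rough sketch of that approach (the task name here is made up, and since `New-ScheduledTaskTrigger` offers no logoff trigger, this version fires at logon instead, clearing the previous session’s history at the start of the next one):

```powershell
# Hypothetical sketch: clear File Explorer search history at each logon.
# The WordWheelQuery key only exists after at least one search has been made,
# hence -ErrorAction SilentlyContinue.
$command = "Remove-Item -Path 'HKCU:\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\WordWheelQuery\*' -ErrorAction SilentlyContinue"

# Wrap the command in a powershell.exe invocation for the task action.
$action  = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument "-NoProfile -Command `"$command`""

# Fire at logon; there is no built-in logoff trigger in the cmdlet.
$trigger = New-ScheduledTaskTrigger -AtLogOn

# Register the task under the current user.
Register-ScheduledTask -TaskName 'ClearExplorerSearchHistory' -Action $action -Trigger $trigger
```

Run this once from a PowerShell session; `Unregister-ScheduledTask -TaskName 'ClearExplorerSearchHistory'` removes the task again. Test on a non-production machine before rolling it out.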


## Server Bug Fix: Why were computer customers called “Users”?

The term User for computer hardware and software customers has been universal for as long as I can remember. It has always applied to both hardware and software customers – there were “Lotus Users” and “WordPerfect Users”, just as there were Commodore and Sinclair Users.

I struggle to think of other industries that refer to their customers as “Users”, besides the computer technology industry.

It shouldn’t be about technology. I don’t recall CD and VCR producers calling their customers Users when those high-tech products were introduced, and I think the same applies to Nintendo and Atari (game consoles) too.

And it shouldn’t be about things that are objectively tools. I have never heard of someone buying a hammer or drill referred to as a Craftsman or Dewalt User.

Other industries selling high-cost durable goods, such as automobiles, don’t have Users either, though they often have “Owners”. It seems to me that paying as much for a micro in the 1980s as a decent car cost should have qualified you as both a valued Customer and an Owner, so why a User?

Notably, the term does make perfect sense in 2020, where we have countless Facebook and Twitter Users who do not warrant being called Customers, since they pay no fees, and certainly are not the Owner of anything. So that actually fits – they simply Use.

There ought to be some objective history and reasoning behind the ubiquitous term User, instead of simply being called a customer, like in all other industries. From where did this demand to be called a User originate?

Update: If, as many suggest, the term User originated because it was an early and specific role, what did that role originally convey in terms of the person’s competence, privileges, or restrictions regarding the machine? And does the same role definition thus roughly translate into similar expectations for a customer purchasing an early micro-computer for the home or office?

‘User’ and ‘customer’ aren’t the same.

The user is the person (always a person) who uses a computer system to do something.

The customer is the person or organization who pays for the hardware and software used by the user.

The customer can be identical to the user (private or freelance personal computer user). The customer can be the user’s boss (small firm) or organization (large company); it can also be a different unit of an organization from the one the user is in (IT buys software, business uses software). There are scenarios without customers (free-as-in-beer software), and scenarios without users (custom software that never gets used because by the time it’s finished, the business case has changed).

Note that the term ‘user’ goes back to at least 1961, when DECUS, the Digital Equipment Computer Users’ Society, was founded.[1] At that time, the only computer DEC was selling on the open market was the PDP-1, at a price of US\$120,000 [2]. This price was so high that the customer always was an organization of some sort; however, it was also so low (compared to other computers at the time) that a diverse group of people — not just operators, but also programmers, researchers, students, etc. — could use the computer directly. It appears a collective noun was needed, though I don’t know if the term originated at DEC. From then on, the terms ‘user’ and ‘user group’ seem to have stuck [3], eventually entering common use.

[1] Notably, the SHARE IBM user group pre-dated DECUS by 6 years, but did not call itself a user group at that time.

[2] equivalent to \$1,030,000 in 2019

[3] Note all the user groups popping up in the 1960s and 1970s.

## TL;DR:

User simply names what it is about – the generic use of something – differentiating it from any other role. And let’s be honest, a computer is such a generic device that its use can be manifold, from typist to gamer and accountant to engineer. So any more specific name would leave out other uses.

## In Detail:

Notably, the term does make perfect sense in 2020, where we have countless Facebook and Twitter Users who do not warrant being called Customers, since they pay no fees, and certainly are not the Owner of anything. So that actually fits – they simply Use.

And it makes perfect sense as well for computing up to and into the ’80s, when only a very tiny minority, if any, of the people using computers also owned them.

There ought to be some objective history and reasoning behind the ubiquitous term User, instead of simply being called a customer, like in all other industries.

Besides the historical fact of not owning the machine at all, naming implies a viewpoint, which changes according to who observes what.

• Owner only means you bought something, but says nothing about its application. An owner may never use the computer he bought for his accounting department.

• Customer is simply the viewpoint of a merchant during a sales transaction.

After that it’s about, well, using the thing a customer bought and now owns. So, what do you call owners handling their (non-computer) assets?

• Driving a car: Driver
• Riding a bike: Biker
• Watching TV: Viewer, Listener/Audience
• Using a stove: Cook
• Using a plane/bus/…: Passenger

…and so on.

For computers one could have called them ‘Computer’, which would have made sense – except the term was taken for the machine itself (and for people before that), so that wouldn’t work. Also, computer application is so manifold and new at the same time that it might have been hard to coin a single term to cover it all. So they were simply ‘people using computers’, shortened to users.

It might be helpful to look back at another era-changing introduction: the automobile. Before settling on the archaic ‘driver’ (again), other names, from Automobilist and Conductor to Motorist, were tried for some time, because, let’s be serious, no-one ‘drives’ a car in the genuine sense of the word. It’s steered or operated.

Pilot for a plane is another nice twist over time.

Add on from a comment by Brian:

Sure. I agree with this distinction. [between customer and user] But it doesn’t explain why millions of customers who bought their own gear seemingly wanted or preferred to be called Users, or why the producers saw that as the more appropriate way to address them

It’s important to keep in mind that the role of Customer is exhausted after the bill is paid. Also, it’s not the user who selects the naming, but the ones writing manuals and advertisements. So tell me how a manual for a memory board should address its user? There is no way they know what the board will be used for. The same goes for the manufacturer of a computer or the publisher of spreadsheet software.

While it may work for one-time reads, like on a fridge, to call the reader a ‘customer’, it gets more inappropriate as time passes, and even more so when it’s documentation (books, magazines, etc.) that has no relation to the seller.

User is simply a colloquial term unspecific enough to cover everyone in front of a device made for anything.

And last, but not least, at the time “millions of customers who bought their own gear” became a thing, the term was already coined and in wide use. Try to get people to not say ‘driver’ when referring to a vehicle operator, just because they now own one as well.

And while writing this I realize more and more that this is not a question for RC.SE at all, but maybe some language site, like RichF suggests.

Customers buy things. Users just use them.

The term ‘user’ arose in the days of the mainframe computer, where many users connected to a single computer, often using the same shared terminals during the day. The individuals needed tracking, so ‘users’ and ‘user accounts’ were eventually born.

The notion of others in the workplace being called ‘customers’ was a later thing. I first saw it at work in the early ’90s, with training on ‘customer-driven quality’ built on the premise that ‘all of us are each other’s customers’. Prior to that, I didn’t see customer used anywhere outside of the sales environment. It would have been a weird choice in the days when ‘user’ was adopted.

The term “computer user” is analogous to “automobile driver”. (It is an even better fit, because “driver” specifically excludes passengers.) Many people, in many cases most, who use computer hardware or software are not the purchasers, customers, operators, programmers, administrators, or gamers. I can think of no better term for “user” which encompasses almost every aspect of interacting with a computer. Would you prefer “interactor”?

If you would like an in-depth etymological answer, consider re-asking this question in the English Language and Usage SE.

The word “user” was applied to operators of word processing and office equipment well before computers were commonplace.

Google Books reveals citations back to the 1890s for “typewriter user.” Manufacturers appeared to use this terminology as well, such as in this Remington ad from 1920: “Today, as always, the typewriter user who wishes to reach the lowest cost level of typing must go to the Remington.” Similarly, here the operator of an automatic inkstand is described as a “user” in 1895, and much the same phrasing is used for the operator of a new type of eraser and a pencil sharpener. That one edition of “The Book-Keeper” uses the word “user” 16 times, almost all referring to the person who uses office equipment. This January 1891 publication of “The Office: A Practical Journal of Business” uses the word “user” repeatedly to describe those making use of all kinds of office equipment and supplies.

It seems logical then that as new types of automated data processing equipment began to appear, that the term “user,” which was already applied to the existing equipment, would apply to computers as well.

It is one of the terms used originally to distinguish between the various roles involved in the creation and use of hardware and software. As a term it probably made much more sense in the days of room-sized mainframes where everyone had a lab coat on.

There are:

• People who sell computers: vendors

• People who buy computers: customers of the vendors (and for pre-personal computers, this was generally an ongoing relationship)

• People who program the computers that were bought: programmers

• People who operate the machinery that is the computer while it runs the programs that the programmers have written: operators

• People who, with the possible assistance of the operators (to load cards and tapes, burst lineprinter paper, etc), use the programs the programmers have written: the users

To further Pete Kirkham’s answer: IBM’s TSO (Time Sharing Option) was one of the more popular software options that allowed customers of IBM to rent out time on their computers (a.k.a. the mainframes of the time) to users that didn’t need, didn’t want, or couldn’t afford a computer of their own. This solidified the concepts of

1. “Users” (people that asked the computer to do work for them),
2. “Operators” (people in charge of making sure the computer stayed running, who used the special console to interact with the operating system with “godlike powers”),
3. “Engineers” (people that created the programs that ran on the computers), and
4. “Owners/Customers” (people/companies that actually paid for the hardware and software).

“User groups” became an extension of these concepts, where groups of people that used the software to accomplish their work would meet and learn from each other. Many arcane tips and tricks were passed around in these groups, and the groups would often be anchored by “neckbeards”/gurus – in many cases the engineers that wrote the software, looking for ideas on how to make it better and more usable.

One might infer this from what the other answers said, but I don’t see it explicitly stated.

A person who uses a computer is a user.

This “-r” suffix to refer to the person performing an action is just a feature of the English language.

There isn’t really a better verb to summarise everything you do on a computer than “use”, thus “user”. With a (modern) computer, you type on the keyboard, you move the mouse, you look at the screen, you possibly touch the screen, you may also interact in other ways with other peripherals. You can also look at what you’re doing at a higher level, but this still doesn’t really give a better verb than “use” (with the exception of some specific types of applications, e.g. a gamer games or a player plays games, but this doesn’t generalise to the computer as a whole).

You can see this language feature in other places too:

• A person who drives a car is a driver.
• A person who rides a bike is a rider.
• A person who views a television show is a viewer.
• A person who manages others is a manager.
• etc.

The person paying for the software is very often not the same person using the software.

I currently make support bots, the companies buying our product are our customers, the people using it are our users. A university with an Office 365 subscription is a customer, all the staff and students are users. For Gmail, the advertisers are the customers, the people using it are users.

Back in the old days, main-frames were widespread, so the same situation applied. The entity owning the main-frame was the customer, the people running software on it were users.

You need clear and concise definitions. E.g. SLAs for customer support ticket response times vs user support tickets are usually widely different.

Following on from Big Mike’s recollection, which matches my own, I had a look into CTSS, which was about the first system with more than one user, to see if I could find a source of the terminology earlier than 1961.

First movers often set terminology for a technical domain. John Backus uses the term when discussing the ‘Effect of Automatic Coding on Machine Design’ in 1954, and if John Backus gives something a name it tends to stick.

Mr. P.F. Williams said that his firm is trying to decide on a computer
to use. They want something intermediate between an IBM 650 and 704. The
704 seems too large: they ‘would not be able to keep it busy’. John Backus
said that by time sharing, a big computer could be used as several small
ones; there would need to be a reading station for each user. Mr H. Freeman
remarked that down time would be worse for one large machine than for several
small ones. Stanley Gill said that the idea of a centralised computing
bureau seemed a good one, except that if each subscriber were to have his own
input-output equipment he would not be able to afford such flexible equipment.
Prof. Adams pointed out that the idea was similar to a centralised stenographic bureau, which was not always successful.

D.L. Shell said that Bell Laboratories had in fact used a central computer with remote
input-output very successfully. He said the cost per operation of a machine is
roughly proportional to the square root of the time per operation; also the work
load increases exponentially with the time. He felt that where elapsed time is
vital in case of a breakdown, a large computer is still preferable.

(Automatic coding – compilers generating machine code rather than doing it by hand)

I don’t know which system Backus was thinking of, if any, as this is from the decade before any IBM system in Wikipedia’s timeline, so I don’t think there was a specific ‘role’ for the term; rather, the ‘customer’ in the conversation would have been P.F. Williams’ firm rather than the people operating the reading stations. Note that the other people in the discussion used different terms – subscriber above and, earlier in the discussion, customer is mentioned, though it’s not clear if that refers to a user rather than a university:

Dr Grace Hopper raised the possibility of using several small computers in parallel. The greatest demand was for small machines, and she hoped that each university would eventually have one. […] She foresaw a mass produced small machine, delivered with a compiler and library appropriate to the customer’s needs.

Mr J.W. Backus disagreed[…]

Programmers and engineers are also mentioned in the discussion; however in the context of their use, the programming takes place before the engineers finalise the hardware design, so it seems that they are not considered users.

Dr J.C.P. Miller then proposes something sounding a lot like RISC computing. So users, not-quite personal computers and not-quite RISC, all in just three pages of discussion from 1954.

As someone who was involved in designing minicomputers in the 70’s I’d like to add that from the point of view of an engineer developing the computer they might be seen as “mere” users. Also seen in user mode vs system/kernel mode. To system code developers application programmers can be seen as “mere” users.

I also agree that until the personal computer era customers who bought computers and users who programmed them were not the same people.

By my recollection, computers became multi-user, meaning that multiple people could operate the same tool simultaneously. Each user had “user space”, meaning values assigned to do their work, and shared or system space, meaning it held values assigned to operate the system or exchange information between users. This was different from batch-oriented systems that relied on card stacks or tapes to fetch the next operation the computer was supposed to act upon.

I started programming on cards in 1969. Work performed on the computer was a “job” that ran in “batches”. Hand in your stack of cards, pray there wasn’t a single typographical error, and you’d get a job report printout on a paper report of your program results.

I saw the first use of an on-line terminal in 1976 at the university I was attending. I counted seven rows of students packed in, watching one guy using the terminal, and we were completely blown away.

Now computers had two segments of work to manage. Jobs continued as they had from the beginning, but “user” management was required to deal with the needs of on-line terminals, such as data entry. One had to allocate scarce CPU cycles and memory between batch and user consumption.

The term user was coined to encapsulate the needs of managing security, resource consumption, etc. that came with people running work on a computer that now had multi-tasking capabilities.

Before IBM popularized personal computers, which led to the dominance of Microsoft, computers were huge. They were owned by departments, institutions, universities. Nobody owned a computer. In order to program or do other work on a computer, you had to be given permission to use it. So it was a machine that you shared with many other people.
You became a user of a computer when you belonged to an organization. You didn’t (usually) pay a fee to use the computer, so you were a user, not a paying customer.

Because ‘user’ wasn’t in any way synonymous with ‘customer’. If the computer is in an organization, then the organization itself might be the ‘customer’, but all the people who use it (in that org, plus everyone else who ever accesses or copies that software) are ‘users’. Also, because licenses expire and PC software copying was rife back in the 1980s/1990s, it was quite common that many of the ‘users’ weren’t ‘customers’, or never had been.

Yes there is some objective history and reasoning behind this:

1. The analogies to hardware aren’t valid (CD, VCR, game consoles, etc., or cars). You can’t duplicate hardware (well not easily), and typically there is one customer and that person is also the user. This is self-evidently not the case with software.
2. As to Facebook and Twitter ‘users’, they aren’t ‘customers’: they don’t pay any licensing fee, so they can’t have damages, and IIUC since there isn’t any ‘consideration’ (money or other thing of value) they don’t even have a contract with FB (IANAL); (they have a ‘user agreement’, but that’s different, its terms mutate constantly, and it’s very one-sided). Facebook’s actual ‘customers’ are the people/companies who buy advertising on it, and/or pay for or have licensing agreements for access to its datastream. Hence the adage: “If you’re not the customer, you’re the product.”
3. It was quite common that many of the ‘users’ weren’t ‘customers’.
PC software copying was rife back in the 1980s/1990s (before US and other countries passed laws on DRM, and mandated building copy-protection/detection for content into the OS (Windows) itself and hardware (blocking or fuzzing the display of non-DRM content); there used to only be more basic software copy-protection, serial-port hardware dongles, and challenges to see if you owned the user manual (“What is the second word of the fifth sentence on page 208?”). Another anecdote: AutoCAD, the most widely-used(/pirated) CAD suite, was said to only have 7 valid (= purchased) license keys in all of Russia in the early 90s.
4. Even in an organization (or group) which had legitimately bought (/“licensed”) the software, the ‘customer’ is the organization itself (or, in a group, the individual who purchased it and allows other people access to it). There would be many ‘users’ but typically at most only one ‘customer’.

Update: If, as many suggest, the term User originated because it was an early and specific role, what did that role originally convey in terms of the person’s competence, privileges, or restrictions regarding the machine? And does the same role definition thus roughly translate into similar expectations for a customer purchasing an early micro-computer for the home or office?

This is tantamount to asking for an overview essay on the legal history of ownership(/licensing) in software, hardware and content (EULAs, shrinkwrap, DRM…), across multiple decades and jurisdictions, 1980s-current(!) That’s a multi-semester course.

According to Richard Gabriel, users at MIT in the seventies had another more descriptive name.

The right thing is to back out and restore the user program PC to the instruction that invoked the system routine so that resumption of the user program after the interrupt, for example, re-enters the system routine. It is called PC loser-ing because the PC is being coerced into loser mode, where loser is the affectionate name for user at MIT.

The term customer applies when you are in a shop, be it for cars, computers or many other things. So there is no link between user and customer. In the car industry they design the car for the driver first and then the passengers – not the customers. The same goes for computers: they are designed to serve the user, and that is why customers pay for them.

And actually in my domain, we deal with travel – mostly people using a computer to book a trip – but the name we use is not user but mostly traveller. And they are not our clients. Our customers are travel agencies or airlines. But the people that use our application (the end users) are most often travellers that want to book a trip. And the distinction is important. To be successful we must serve our end users (travellers) well, but also our clients (travel agencies and airlines).

It is exactly the same when you sell a plane. The customer is often an airline, sometimes actually a company that rents out planes. The users of the plane are pilots, stewards and mechanics, as well as the passengers. Each type of user has a different role.

Even in the car case, there is also the car mechanic. He is another user, and the car is designed in a way that the mechanic can easily repair it.

So I agree with you that “user” is quite a generic term, as opposed to traveller, passenger, driver… But this has no link with the notion of customer/owner.

In computer science you have users, and thanks to user rights management you actually create groups and name them as you wish, depending on your use case. There is the generic opposition of system administrators vs. users, but typically in your company you could give different rights to different teams or roles, and the computers would reflect that.

Because of that – because we don’t know what role the users are actually given, and because users don’t even need to be people but can perfectly well be computer programs – it makes sense to have a very generic term and let the users themselves choose their names/groups.

After all, in a car entertainment system the user is a driver or passenger; in a CD player, it is a listener; and so on. Remember, there are computers in basically everything today.

I suggest looking at this question from a critical point of view – critical in the sense of not defining the term user from the perspective we have today and then reconstructing its historical context to fit our definitions.

One theoretical point of view is to look at how users were constructed along with technology. The book How Users Matter, edited by Nelly Oudshoorn and Trevor Pinch, covers some aspects of the constructivist movement.

In their introduction, they summarize the main claim, which is to “go beyond technological determinist views of technology and essentialist views of users’ identities.” What this means is that we should not assume that users exist only for the sake of ‘using’ technology, but that they take an active role in shaping technology, as well as being shaped and ‘constructed’ by it.