How to measure programming language succinctness?

Answers (2)

manpreet · Best Answer · 2 years ago

 

I want to explore the notion of quantifying the amount of succinctness a programming language provides. That is, the amount by which a high-level language reduces the complexity of solving a problem.

This idea of "simplification" has two components: text-wise reduction (fewer characters needed to express a complex concept, à la Algorithmic Information Theory) and a second, harder-to-quantify notion of maintainability. Fleshing out the latter, it comes down to how easily one can establish programmer consensus for a given task (i.e. how many programmers of the language would write it back the same way you expressed it, or would otherwise agree on the best implementation among different implementations of the same problem).

I will define the "Kolmogorov Quotient" so that higher numbers for a given language denote a reduction in the complexity of solving the problem in the given language.

The metric for "text-wise reduction" should incorporate a constant factor based on the (non-identifier) symbols used in the language and source text. These factors will be the same across all languages implemented (i.e. designed) for a given architecture (e.g. a von Neumann architecture vs. a Symbolics Lisp machine) and will measure the significance of each symbol, i.e. the topology of the language's tokens. (These terms, alas, also need to be fleshed out and defined.)

Once the basic premise and a methodology above is agreed to, it is only a matter of a rough constant of difference for any specific implementation/architecture. (That is, as long as the architecture is the same across all measurements, the number should be valid and comparable between languages.)

But it could go something like this: pick a language "close to the machine", like C or assembly, and measure the number of bytes of machine code it takes to implement a standard suite(*) of common, non-threaded programming tasks (base_language_count). Then code the exact same functionality in the language you want to measure (without using external libraries) and count the number of bytes of source code (test_language_count).
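As a rough illustration of this measurement step, here is a minimal Python sketch. The file names, the choice of gcc, and the use of the whole object file as a stand-in for "bytes of machine code" are all assumptions made purely for illustration:

```python
import os
import subprocess

def machine_code_bytes(c_source):
    """Compile a C reference solution and return the size of its object file.

    Using the whole object file as a proxy for "bytes of machine code" is a
    simplification; a stricter harness would measure only the .text section.
    """
    obj = c_source.rsplit(".", 1)[0] + ".o"
    subprocess.run(["gcc", "-O2", "-c", c_source, "-o", obj], check=True)
    return os.path.getsize(obj)

def source_bytes(path):
    """Count the bytes of source code of the test-language implementation."""
    return os.path.getsize(path)

# Hypothetical files: the same task solved once in C and once in the test language.
base_language_count = machine_code_bytes("task_wordcount.c")
test_language_count = source_bytes("task_wordcount.py")
```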

KQuotient = (base_language_count / test_language_count) / number_of_equal_programs.

"number_of_equal_programs" is the number of programs fluent programmers of the language agree that are the best and equal solutions to the problem. (I will define "equal programs" as those who's output is the same for every input.)

The first ratio should always be greater than 1.0. My only concern is that, for any given language, one could artificially inflate the KQuotient simply by shortening each keyword to a single character.
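With those two counts and a consensus count in hand, the quotient itself is a one-line computation; the numbers below are made up purely to show the arithmetic:

```python
def kquotient(base_language_count, test_language_count, number_of_equal_programs):
    """KQuotient = (base_language_count / test_language_count) / number_of_equal_programs.

    Higher values are meant to indicate that the test language solves the task
    more succinctly.  Note the gaming risk mentioned above: shortening keywords
    shrinks test_language_count and inflates the score.
    """
    return (base_language_count / test_language_count) / number_of_equal_programs

# Made-up numbers: 4096 bytes of machine code for the C reference,
# 600 bytes of source in the test language, 3 consensus solutions.
print(kquotient(4096, 600, 3))  # ~2.28
```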

(*) "standard suite of common programming tasks...": I see two main categories:

  1. Data-processing suite, limited to simple text I/O (computation towards the machine)
  2. GUI suite (computation towards the user)
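To make the first category concrete, a suite entry might look like the following word-frequency task, shown in Python purely as an illustration of "simple text I/O"; the task itself is hypothetical and not part of any agreed-upon suite:

```python
import sys
from collections import Counter

def word_frequencies(text):
    """A typical data-processing task: case-insensitive word frequencies."""
    return Counter(text.lower().split())

if __name__ == "__main__":
    for word, n in word_frequencies(sys.stdin.read()).most_common(10):
        print(f"{word}\t{n}")
```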

The purpose of this idea is to end the tiring "language wars" over which language is best. With a quantitative metric, people can at least argue better.

manpreet · 2 years ago

 

The ideas the question expresses are interesting but maybe insufficiently fleshed out. I can see a couple of points that deserve further refinement.

  • It is difficult to "code the exact same functionality". Partly that is because what counts as "the exact same functionality" depends on the chosen notion of program equivalence. For terminating programs in sequential programming languages there is a canonical definition, but once you move to concurrent, non-terminating programs, canonicity vanishes and you are left with multiple reasonable choices that are mutually incompatible (a rough, testing-based approximation is sketched at the end of this answer). Secondly, and related to the first point, programming languages are fundamentally different: things you can do in a low-level language often cannot be done in a high-level language. A typical example has to do with speed: certain tasks can be solved intrinsically faster in assembly language than in e.g. Prolog or JavaScript (assuming conventional compilation).

  • The KQuotient is subjective, in the sense that for any given task, programmers A and B will likely produce base- and test-language implementations of different lengths.

  • Moreover, the KQuotient is also task-dependent: given a pair of languages, the KQuotient for Task A will in general differ from the KQuotient for Task B.

Finally, even though I find the idea of measuring maintainability via the number of queries needed to reach consensus intriguing, I suggest being clearer about what kinds of queries are relevant here (syntactic? semantic?) and about how the concept of programmer consensus can be made precise.
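On the equivalence point above: "same output for every input" is undecidable in general, but a harness can at least approximate it by sampling inputs, along the lines of the sketch below. The program paths and the input generator are placeholders, and passing the check only shows agreement on the sampled inputs, not on all inputs:

```python
import random
import subprocess

def same_output(prog_a, prog_b, stdin_text):
    """Run two candidate programs on the same input and compare their stdout."""
    out_a = subprocess.run(prog_a, input=stdin_text, capture_output=True, text=True)
    out_b = subprocess.run(prog_b, input=stdin_text, capture_output=True, text=True)
    return out_a.stdout == out_b.stdout

def random_input():
    """Placeholder input generator; a real harness would also cover edge cases."""
    words = ["foo", "bar", "baz", "qux"]
    return " ".join(random.choice(words) for _ in range(random.randint(0, 50)))

def probably_equal(prog_a, prog_b, trials=100):
    """Approximate "equal programs": agreement on a sample of inputs only."""
    return all(same_output(prog_a, prog_b, random_input()) for _ in range(trials))

# Hypothetical usage: compare a compiled C reference with a Python implementation.
# probably_equal(["./task_wordcount"], ["python3", "task_wordcount.py"])
```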

