Looks like a decent basic summary. Section G.2.6 should probably be called "Multiple Dispatch" rather than "Function Overloading", though. Defining multiple methods of the same function in Julia is different from function overloading in a language like C++ [1].
Perhaps the author was trying to make the overview beginner-friendly by using familiar terms. I have found it helpful to compare and contrast Julia concepts with C++ template concepts. For example, the term "trait" is used in similar ways in both languages, but the big difference is that in Julia type information can be dispatched on at run time, not just at compile time.
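To make the contrast concrete, here's a minimal sketch (types and function names invented for illustration): in Julia the method is selected on the run-time types of all arguments, whereas C++ overload resolution happens at compile time on the static types.

```julia
abstract type Pet end
struct Dog <: Pet end
struct Cat <: Pet end

meets(a::Pet, b::Pet) = "generic meeting"
meets(a::Dog, b::Cat) = "chases"
meets(a::Cat, b::Dog) = "hisses"

# Both arguments are dispatched on at run time, even though the
# container only knows the elements as Pet:
pets = Pet[Dog(), Cat()]
meets(pets[1], pets[2])  # "chases" -- C++ overload resolution on the
                         # static type Pet would pick the generic method
```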
This is a pretty good overview, and I like it for the same reason I liked the "Half-hour to learn Rust" one from a few days ago: a compressed presentation that's easy to skim through, but without shying away from details too much.
It's strange to see a Julia overview without any mention of multiple dispatch though. "Function overloading" is briefly mentioned, but if this had been a pragmatic overview like the Rust one (instead of one given for use in book exercises), I would have suggested including multiple dispatch, type instability, and the use of `@code_warntype`.
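For anyone unfamiliar: type instability means a function's inferred return type depends on run-time values rather than argument types, and `@code_warntype` flags it. A toy example (not from the overview):

```julia
# Type-unstable: returns Int or Float64 depending on a run-time value,
# so the compiler can only infer Union{Float64, Int64}
unstable(x) = x > 0 ? 1 : 1.0

# Type-stable alternative: the return type depends only on the argument types
stable(x) = x > 0 ? 1.0 : -1.0

# In the REPL, `@code_warntype unstable(1)` highlights the
# Union{Float64, Int64} return type; `@code_warntype stable(1)`
# shows a concrete Float64.
```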
Indeed, Julia's abstract type system with multiple dispatch is its killer feature. It enables generic programming in a beautiful and concise fashion. Here [1] Stefan Karpinski gives some explanations of why multiple dispatch is so effective.
"It's strange to see a Julia overview without any mention of multiple dispatch though."
The link below is an informative recent discussion of OOP vs. multiple dispatch on the Julia Discourse forums. There is some overlap between the two, but I much prefer the multiple dispatch approach:
Discussion: Why does Julia not use class-based OOP?
The thing that has kept me from Julia is the size of the ecosystem, but it's been maybe two years since I last tried it; otherwise the language seemed very nice. How are things doing for ML, and NLP specifically, if anyone knows?
I've seen this in other Julia discussions. Can someone show me a use case other than trying to cycle over the same data (EDIT: or range) repeatedly where 0-based is fundamentally better than 1-based? I mean, that's the only case I can think of where 0-based is actually going to be easier and more sensible.
But I'd argue that languages should stop having fixed offsets, 1-based and 0-based are both too limiting. Ada is around 40 years old (and it's not unique in this) and it provides arbitrary index ranges and the option to use any discrete type as the index so that you can use whatever index is most natural for your particular problem.
An example from my field: in financial modeling the typical pattern is to calculate an initial cash flow followed by further projected cash flows at fixed intervals (e.g., monthly). The natural way to store a series of cash flows (besides as a column in Excel) is just as an array, and if that array is 0-indexed, the index represents each cash flow's offset from the initial cash flow. This is both more natural and more convenient for lots of calculations - e.g., when calculating the net present value the number of discounting periods for each cash flow is just its index in the array.
I haven't encountered a code base that uses non-default array indices, but it sounds like a serious anti-pattern, especially if you're just changing from 1-based to 0-based.
The Julia translations of these might look as follows:
```julia
using OffsetArrays

cash_flows = zeros(0:max_time)        # 0-indexed vector
altitude = zeros(Bool, -100:100)      # indices from -100 to 100

# Iterating without assuming a particular index base:
for i in eachindex(cash_flows)
    some_function(cash_flows[i])
end

# or in this case avoiding indexing entirely:
for cf in cash_flows
    some_function(cf)
end

foreach(some_function, cash_flows)
```
OffsetArrays is not quite a standard library but it's close. A lot of library code will work like this, calling `eachindex` or `axes` so as to be indifferent to how the arrays are indexed, and to pass this behaviour through to outputs as appropriate.
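To tie this back to the NPV point upthread: with a 0-indexed OffsetArray, `pairs` hands you the index, which is exactly the number of discounting periods, alongside each cash flow. A sketch with made-up numbers and a made-up rate `r`:

```julia
using OffsetArrays

r = 0.05  # assumed periodic discount rate

# Initial outflow at t = 0, projected inflows at t = 1..3
cash_flows = OffsetArray([-100.0, 40.0, 40.0, 40.0], 0:3)

# pairs() yields (index, value); the index t is the number of
# discounting periods for that cash flow, with no off-by-one arithmetic
npv = sum(cf / (1 + r)^t for (t, cf) in pairs(cash_flows))
```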
Thanks. I'll be taking a look at that for my code. I probably won't use this extensively, but it is a feature I have found useful in Ada (though I'm only a hobbyist; I gave up on convincing work to use it). I'm still a novice with Julia, but I'm doing some math-heavy code and started exploring it for its relative ease of use (REPL, fast execution after compilation, ergonomics-focused language design).
> I haven't encountered a code base that uses non-default array indices,
If you're referring to <parent>'s mention of Ada with arbitrary ranges of indices, that's less about thinking of them as array indices and more of "a range of indices that make sense for your domain". Thinking about your problem in your problem's space rather than "how would my computer think of this"; making the map more like the territory, as it were.
Apparently you can [0]. However, it seems (on my first reading) to be something I'd avoid, because it sounds like the "natural" way to code in Julia is to assume 1-based indexing, and you're likely to run into issues with libraries if you try to do anything else.
I stumbled a few times while reading binary data from a file in a 1-based language while I was using a hex editor (0-based) to view the same file. I don't think it proves one way is better than the other, but I suspect it's the common stumbling block that keeps the argument going.
Lua's arrays/tables are more like hash tables, with an ability (still true? been a while since I've done anything with Lua) to optimize if you use a contiguous range starting at 1 for the key. But yes, you can use an arbitrary key.
> Can someone show me a use case other than trying to cycle over the same data (EDIT: or range) repeatedly where 0-based is fundamentally better than 1-based? I mean, that's the only case I can think of where 0-based is actually going to be easier and more sensible.
Perhaps the canonical treatise on this subject:
"Why numbering should start at zero". E. W. Dijkstra, 1982.
I actually had that in my first draft of my comment. But I hadn't read it in a while and just did.
While he endorses 0-based in one of the remarks, he's actually endorsing the notational format of:
a <= x < b or [a,b)
Where if b is renamed N, the length of a vector/array/list, in a 0-based notation, then you'd describe a 0-based vector's range as [0,N). The reasons he gives for preferring this notation for ranges (not, strictly, for 0-based ranges, but for all ranges):
1. Experience at Xerox where this notation (versus the other 3 he describes) led to fewer errors. An informal study, but a study nonetheless.
2. Using either [a,b) or (a,b], the size of the range is the difference between the provided bounds.
3. Using [a,b), adjacent ranges can be detected where b_1 = a_2. Given ranges [2,13) and [13,20) you can see that they're adjacent by just comparing two values. This certainly makes it quick to visually inspect as a code reader/writer. (the same argument can be made for (a,b])
4. An argument for either [a,b) or [a,b] is that the lower bound should be described by the minimum number in the range because it's more aesthetically pleasing.
So by process of elimination, he's left us with [a,b) as the better notation of the 4 options.
Based on an aesthetic argument, if you accept the above, then 0-based makes more sense because [0,N) is more aesthetically pleasing than [1,N+1). But if you use notation (c) from his report:
a <= x <= b or [a,b]
Then 1-based can be described as the interval [1,N] where N is both the last element and the length of the vector/array/list. Which seems rather pleasant/natural to my eyes and fingers as well.
----------
If we accept the experience at Xerox, then his argument for 0-based indices is reasonable based on the assumption that ranges should be described as [a,b). If we don't accept it, then his argument is mostly based on aesthetics. That is, it's more pleasant to do a computation like:
range size = b - a
than (for ranges described with [a,b]):
range size = b - a + 1
And it's more pleasant to do a comparison like:
adjacent? b_1 = a_2
than (for ranges described with [a,b]):
adjacent? b_1 = a_2 - 1
But that first case doesn't matter in a 1-based array because the range size is just `b`; it's already stated in the range and there's no need for computation (just as it's present in 0-based ranges). Now, if your language permits arbitrary ranges then I think a case could be made for his suggested [a,b) notation. But if you're only choosing between 0-based and 1-based, I don't find it persuasive. It's still a tossup for me; neither is better than the other unless you also choose his notation for describing ranges, where [1,N+1) would be awkward but [1,N] is easier to use and understand.
In languages like C it's useful to have 0-based indexing because it's not really an index at all but a pointer offset. You don't do pointer math in languages like Julia and Matlab (and Fortran), but instead actual indexing, so 1 is a better place to start.
Also, 1-based indexing is easier to teach those new to programming, especially children. 0-based indexing is a significant stumbling block for people, since they are used to counting from 1, which leads to all kinds of off-by-one errors.
This small detail always comes up for some reason.
Anyway, I recently tried implementing some numerical linear algebra algorithms based on descriptions from papers and books. The books and papers all used 1-based indexing. This created some problems for me when I translated the pseudo-code to Python (which is 0-based).
It's really a strength, not a quirk. Negative indexing and array slicing in general are great in Python. Really easy to pick up and way more convenient than any other language that I've come across.
Negative indexing in Python is a dangerous design that hides bugs and causes incorrect programs to return garbage instead of erroring. If you have an incorrect index computation, instead of getting a bounds error, you get different indexing behavior. This makes it dangerously easy to write code that appears to work but computes nonsense.
Without checking, I'm not sure whether a[-1:0:-1] reverses a list. I'm not sure if it includes the first element of the list or not [edit: It doesn't]. I'm not sure why in contrast a[::-1] does reverse a list. I found array slicing (and ranges) to be a source of confusion when programming the aforementioned linear algebra algorithms.
IMO, the Julia approach is better: 3:-1:1 produces [3,2,1]. Both the starting and ending points of the range are included.
The example a[-1:0:-1] should be pretty understandable once you've spent enough time in the language -- you're inclusive at -1 and exclusive at 0, so the list is reversed and is missing its formerly first element (should one exist).
That logic is pretty consistent. The start is always inclusive, so it needs to be `len(a)-1`, `-1`, or `None`. The end is always exclusive, so you need to choose `None`. As a result, counting all the syntactic sugar available to you, you have 8 slices that can reverse a list. To name a few, you have a[::-1], a[None:None:-1], and a[-1::-1].
IMO a much more interesting example for the strengths of inclusive indexes is a[len(a)-1:-1:-1]. The result is always empty, but it wouldn't be too much of a stretch to think you included len(a)-1, you decremented to -1 exclusively (thus including 0), and hence reversed the entire list. The problem is that -1 is a valid index, and unlike the a[0:len(a)] case you don't have any values "before" the beginning of the list to include 0 in an _exclusive_ expression.
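The claims above are easy to check in a REPL:

```python
a = [1, 2, 3, 4]

# Inclusive start -1 (last element), exclusive stop 0: drops the first element
assert a[-1:0:-1] == [4, 3, 2]

# Omitted bounds extend to the ends of the list, so these all reverse it
assert a[::-1] == a[None:None:-1] == a[-1::-1] == [4, 3, 2, 1]

# Here the stop of -1 means "the last element", not "one before index 0",
# so the slice is empty rather than a full reversal
assert a[len(a) - 1:-1:-1] == []
```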
It's all a bit of a moot point though. I know Python especially chose its [inclusive,exclusive) convention largely because it wanted expressions like range(len(a)) to not require additional arithmetic for common use cases given that it had zero-based indexing. Julia has one-based indexing, so for common use cases an [inclusive,inclusive] pattern falls out as the most natural choice. I have no idea if Julia actually cared about that sort of thing or if such a convention came about by accident, but it seems like a clean choice for a one-based language.
Hmm, a lot of thought has gone into indexing in Julia. Julia allows indexing by position `A[1,1]`, slicing `A[1:5, :]`, linear indexing `A[1]`, logical indexing `A[A .== 1]`, relative indexing `A[end-1]`, and cartesian indexing `A[CartesianIndex(1,1)]`.
The way Julia combines these in a mostly non-conflicting and non-confusing manner is a major engineering feat.
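A quick tour of those forms on one small matrix (results noted in comments):

```julia
A = [10 20; 30 40]        # 2×2 matrix

A[2, 1]                   # positional: 30
A[1:2, 2]                 # slicing: [20, 40]
A[3]                      # linear, column-major order: 20
A[A .> 25]                # logical mask: [30, 40]
A[end, end]               # relative: 40
A[CartesianIndex(1, 2)]   # cartesian: 20
```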
For example, the following rule was found to be just the right balance between permissive and strict behaviour:
* You are permitted to index into arrays with more indices than dimensions, but all trailing indices must be 1.
* You are permitted to index into arrays with fewer indices than dimensions, but the length of all the omitted dimensions must be 1.
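Both rules can be seen with a small example (a sketch; the commented-out line errors):

```julia
A = reshape(1:6, 2, 3)     # 2×3 matrix

# More indices than dimensions: trailing indices must be 1
A[2, 3, 1]                 # ok, same as A[2, 3] == 6
# A[2, 3, 2]               # BoundsError

# Fewer indices than dimensions: omitted dimensions must have length 1
B = reshape(1:6, 2, 3, 1)  # 2×3×1 array
B[2, 3]                    # ok: the omitted third dimension has length 1
```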
I don't know; the grammar is already pretty complicated without introducing additional indexing schemes. Now that you mention it though, I don't know that I've ever written an algorithm mixing and matching negative/positive indices in a way that couldn't be trivially re-written with that kind of syntax. I'm sure such cases exist for somebody...
Looking for solutions, if you're stuck with Python and hate that behavior:
- If you don't need errors raised then a[~i] is already equal to a[-i-1].
- In your own code (or at its boundaries when wrapping external lists) it's nearly free and not much code to subclass list, override __getitem__, and raise errors for negative indices while responding to some syntax like a[0,] or a.R[0].
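A sketch of the subclassing idea (the class name is made up; only integer keys are intercepted, so slices keep their normal behaviour):

```python
class StrictList(list):
    """A list whose integer indexing rejects negative indices."""

    def __getitem__(self, key):
        if isinstance(key, int) and key < 0:
            raise IndexError(f"negative index {key} not allowed")
        return super().__getitem__(key)

a = StrictList([10, 20, 30])
assert a[0] == 10          # normal indexing still works
assert a[1:] == [20, 30]   # slices pass through untouched
try:
    a[-1]                  # would silently mean a[2] on a plain list
    assert False, "expected IndexError"
except IndexError:
    pass                   # raised as intended
```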
A mathematician here. The problem is that both 0 and 1 are useful in different situations: when dealing with sequences and series, especially ones arising from discrete models where the important thing to consider is going from one term to the next one (as is always the case when dealing with series), or when sequences are used as discrete approximations of a continuous process, it usually makes much more sense to start indexing with zero. On the other hand, using zero indexing with matrices and vectors leads to madness.
Not the other way around? My experience is that the US uses 1-based indexing for floors in buildings and that Western Europe uses 0-based indexing for floors in buildings.
Corrections welcome, but I think Europeans don’t count floors, they count additional storeys, 1-based.
I think that’s influenced from French, which has parterre (“on the ground”) for the ground floor, and étage, derived from Latin stare, “to stand”, for higher (and lower) storeys (in the end, both may come from Latin)
If Europeans use 0-based counting for floors, I would expect at least some language to say “zeroth floor”. I’m not aware of any.
> If Europeans use 0-based counting for floors, I would expect at least some language to say “zeroth floor”. I’m not aware of any.
I feel that argument does not hold water. Nobody says “second month” or “month two” in English, everyone says “February”. So it sometimes happens that a number is assigned to something, but nobody uses the number in regular speech.
I feel that the ground floor is similar: it's got a number assigned to it, but everybody uses a different word, instead (parterre in French, Erdgeschoss in German, ...).
In Germany at least, it is common to find negative floor numbers, denoting floors that are below ground. For example, an elevator might have these buttons (top to bottom):
6, 5, 4, 3, 2, 1, 0, -1, -2
But it is also common to find some other letters on the buttons, instead of the numbers. E.g.:
P2, P1, 4, 3, 2, 1, E, U1, U2
P2 and P1 would be the car park, E would be Erdgeschoss = parterre = ground floor, U1 and U2 would be “underground” (below ground).
[1] https://www.youtube.com/watch?v=kc9HwsxE1OY&t=392s