Looks like a decent basic summary. Section G.2.6 should probably be called "Multiple Dispatch" rather than "Function Overloading", though. Defining multiple methods of the same function in Julia is different from function overloading in a language like C++ [1].
Perhaps the author was trying to make the overview beginner-friendly by using familiar terms. I have found it helpful to compare and contrast Julia concepts with C++ template concepts. For example, the term "trait" is used in similar ways in both languages, but the big difference is that in Julia type information can be dispatched on at run time, not just at compile time.
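To make the contrast concrete, here's a minimal sketch (types and function names invented for illustration): in Julia the method is selected on the run-time types of all arguments, whereas C++ overload resolution happens at compile time on the static types.

```julia
abstract type Pet end
struct Dog <: Pet end
struct Cat <: Pet end

meets(a::Pet, b::Pet) = "generic meeting"
meets(a::Dog, b::Cat) = "chases"
meets(a::Cat, b::Dog) = "hisses"

# Both arguments are dispatched on at run time, even though the
# container only knows the elements as Pet:
pets = Pet[Dog(), Cat()]
meets(pets[1], pets[2])  # "chases" -- C++ overload resolution on the
                         # static type Pet would pick the generic method
```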
This is a pretty good overview, and I like it for the same reason I liked the "Half-hour to learn Rust" one from a few days ago: a compressed presentation that's easy to skim through, but without shying away from details too much.
It's strange to see a Julia overview without any mention of multiple dispatch though. "Function overloading" is briefly mentioned, but if this had been a pragmatic overview like the Rust one (instead of one given for use in book exercises), I would have suggested including multiple dispatch, type instability, and the use of `@code_warntype`.
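For anyone unfamiliar: type instability means a function's inferred return type depends on run-time values rather than argument types, and `@code_warntype` flags it. A toy example (not from the overview):

```julia
# Type-unstable: returns Int or Float64 depending on a run-time value,
# so the compiler can only infer Union{Float64, Int64}
unstable(x) = x > 0 ? 1 : 1.0

# Type-stable alternative: the return type depends only on the argument types
stable(x) = x > 0 ? 1.0 : -1.0

# In the REPL, `@code_warntype unstable(1)` highlights the
# Union{Float64, Int64} return type; `@code_warntype stable(1)`
# shows a concrete Float64.
```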
Indeed, Julia's abstract type system with multiple dispatch is its killer feature. It enables generic programming in a beautiful and concise fashion. Here [1] Stefan Karpinski gives some explanations of why multiple dispatch is so effective.
"It's strange to see a Julia overview without any mention of multiple dispatch though."
The link below is an informative recent discussion of OOP vs. multiple dispatch on the Julia Discourse forums. There is some overlap between the two, but I much prefer the multiple dispatch approach:
Discussion: Why does Julia not use class-based OOP?
The thing that has kept me from Julia is the size of the ecosystem, but it's been maybe two years since I last tried it; otherwise the language seemed very nice. How are things doing for ML, and NLP specifically, if anyone knows?
I've seen this in other Julia discussions. Can someone show me a use case other than trying to cycle over the same data (EDIT: or range) repeatedly where 0-based is fundamentally better than 1-based? I mean, that's the only case I can think of where 0-based is actually going to be easier and more sensible.
But I'd argue that languages should stop having fixed offsets, 1-based and 0-based are both too limiting. Ada is around 40 years old (and it's not unique in this) and it provides arbitrary index ranges and the option to use any discrete type as the index so that you can use whatever index is most natural for your particular problem.
An example from my field: in financial modeling the typical pattern is to calculate an initial cash flow followed by further projected cash flows at fixed intervals (e.g., monthly). The natural way to store a series of cash flows (besides as a column in Excel) is just as an array, and if that array is 0-indexed, the index represents each cash flow's offset from the initial cash flow. This is both more natural and more convenient for lots of calculations - e.g., when calculating the net present value the number of discounting periods for each cash flow is just its index in the array.
I haven't encountered a code base that uses non-default array indices, but it sounds like a serious anti-pattern, especially if you're just changing from 1-based to 0-based.
The Julia translations of these might look as follows:
```julia
using OffsetArrays

cash_flows = zeros(0:max_time)        # 0-indexed vector
altitude = zeros(Bool, -100:100)      # indices from -100 to 100

# Iterating without assuming a particular index base:
for i in eachindex(cash_flows)
    some_function(cash_flows[i])
end

# or in this case avoiding indexing entirely:
for cf in cash_flows
    some_function(cf)
end

foreach(some_function, cash_flows)
```
OffsetArrays is not quite a standard library but it's close. A lot of library code will work like this, calling `eachindex` or `axes` so as to be indifferent to how the arrays are indexed, and to pass this behaviour through to outputs as appropriate.
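To tie this back to the NPV point upthread: with a 0-indexed OffsetArray, `pairs` hands you the index, which is exactly the number of discounting periods, alongside each cash flow. A sketch with made-up numbers and a made-up rate `r`:

```julia
using OffsetArrays

r = 0.05  # assumed periodic discount rate

# Initial outflow at t = 0, projected inflows at t = 1..3
cash_flows = OffsetArray([-100.0, 40.0, 40.0, 40.0], 0:3)

# pairs() yields (index, value); the index t is the number of
# discounting periods for that cash flow, with no off-by-one arithmetic
npv = sum(cf / (1 + r)^t for (t, cf) in pairs(cash_flows))
```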
Thanks. I'll be taking a look at that for my code. I probably won't use this extensively, but it is a feature I have found useful in Ada (though I'm only a hobbyist; I gave up on convincing work to use it). I'm still a novice with Julia, but I'm doing some math-heavy code and started exploring it for its relative ease of use (REPL, fast execution after compilation, ergonomics-focused language design).
> I haven't encountered a code base that uses non-default array indices,
If you're referring to <parent>'s mention of Ada with arbitrary ranges of indices, that's less about thinking of them as array indices and more of "a range of indices that make sense for your domain". Thinking about your problem in your problem's space rather than "how would my computer think of this"; making the map more like the territory, as it were.
Apparently you can [0]. However, it seems (on my first reading) to be something I'd avoid, because it sounds like the "natural" way to code in Julia is to assume 1-based indexing, and you're likely to run into issues with libraries if you try to do anything else.
I stumbled a few times while reading binary data from a file in a 1-based language while I was using a hex editor (0-based) to view the same file. I don't think it proves one way is better than the other, but I suspect it's the common stumbling block that keeps the argument going.
Lua's arrays/tables are more like hash tables, with an ability (still true? been a while since I've done anything with Lua) to optimize if you use a contiguous range starting at 1 for the key. But yes, you can use an arbitrary key.
> Can someone show me a use case other than trying to cycle over the same data (EDIT: or range) repeatedly where 0-based is fundamentally better than 1-based? I mean, that's the only case I can think of where 0-based is actually going to be easier and more sensible.
Perhaps the canonical treatise on this subject:
"Why numbering should start at zero". E. W. Dijkstra, 1982.
I actually had that in my first draft of my comment. But I hadn't read it in a while and just did.
While he endorses 0-based in one of the remarks, he's actually endorsing the notational format of:
a <= x < b or [a,b)
Where if b is renamed N, the length of a vector/array/list, in a 0-based notation, then you'd describe a 0-based vector's range as [0,N). The reasons he gives for preferring this notation for ranges (not, strictly, for 0-based ranges, but for all ranges):
1. Experience at Xerox where this notation (versus the other 3 he describes) led to fewer errors. An informal study, but a study nonetheless.
2. Using either [a,b) or (a,b], the size of the range is the difference between the provided bounds.
3. Using [a,b), adjacent ranges can be detected where b_1 = a_2. Given ranges [2,13) and [13,20) you can see that they're adjacent by just comparing two values. This certainly makes it quick to visually inspect as a code reader/writer. (the same argument can be made for (a,b])
4. An argument for either [a,b) or [a,b] is that the lower bound should be described by the minimum number in the range because it's more aesthetically pleasing.
So by process of elimination, he's left us with [a,b) as the better notation of the 4 options.
Based on an aesthetic argument, if you accept the above, then 0-based makes more sense because [0,N) is more aesthetically pleasing than [1,N+1). But if you use notation (c) from his report:
a <= x <= b or [a,b]
Then 1-based can be described as the interval [1,N] where N is both the last element and the length of the vector/array/list. Which seems rather pleasant/natural to my eyes and fingers as well.
----------
If we accept the experience at Xerox, then his argument for 0-based indices is reasonable based on the assumption that ranges should be described as [a,b). If we don't accept it, then his argument is mostly based on aesthetics. That is, it's more pleasant to do a computation like:
range size = b - a
than (for ranges described with [a,b]):
range size = b - a + 1
And it's more pleasant to do a comparison like:
adjacent? b_1 = a_2
than (for ranges described with [a,b]):
adjacent? b_1 = a_2 - 1
But that first case doesn't matter in a 1-based array because the range size is just `b`; it's already stated in the range and there's no need for computation (just as it's present in 0-based ranges). Now, if your language permits arbitrary ranges then I think a case could be made for his suggested [a,b) notation. But if you're only choosing between 0-based and 1-based, I don't find it persuasive. It's still a tossup for me; neither is better than the other unless you also choose his notation for describing ranges, where [1,N+1) would be awkward but [1,N] is easier to use and understand.
In languages like C it's useful to have 0-based indexing because it's not really an index at all but a pointer offset. You don't do pointer math in languages like Julia and Matlab (and Fortran), but instead actual indexing, so 1 is a better place to start.
Also, 1-based indexing is easier to teach those new to programming, especially children. 0-based indexing is a significant stumbling block for people, since they are used to counting from 1, which leads to all kinds of off-by-one errors.
This small detail always comes up for some reason.
Anyway, I recently tried implementing some numerical linear algebra algorithms based on descriptions from papers and books. The books and papers all used 1-based indexing. This created some problems for me when I translated the pseudo-code to Python (which is 0-based).
It's really a strength, not a quirk. Negative indexing and array slicing in general are great in Python. Really easy to pick up and way more convenient than any other language that I've come across.
Negative indexing in Python is a dangerous design that hides bugs and causes incorrect programs to return garbage instead of erroring. If you have an incorrect index computation, instead of getting a bounds error, you get different indexing behavior. This makes it dangerously easy to write code that appears to work but computes nonsense.
Without checking, I'm not sure whether a[-1:0:-1] reverses a list. I'm not sure if it includes the first element of the list or not [edit: It doesn't]. I'm not sure why in contrast a[::-1] does reverse a list. I found array slicing (and ranges) to be a source of confusion when programming the aforementioned linear algebra algorithms.
IMO, the Julia approach is better: 3:-1:1 produces [3,2,1]. Both the starting and ending points of the range are included.
The example a[-1:0:-1] should be pretty understandable once you've spent enough time in the language -- you're inclusive at -1 and exclusive at 0, so the list is reversed and is missing its formerly first element (should one exist).
That logic is pretty consistent. The start is always inclusive, so it needs to be `len(a)-1`, `-1`, or `None`. The end is always exclusive, so you need to choose `None`. As a result, counting all the syntactic sugar available to you, you have 8 slices that can reverse a list. To name a few, you have a[::-1], a[None:None:-1], and a[-1::-1].
IMO a much more interesting example for the strengths of inclusive indexes is a[len(a)-1:-1:-1]. The result is always empty, but it wouldn't be too much of a stretch to think you included len(a)-1, you decremented to -1 exclusively (thus including 0), and hence reversed the entire list. The problem is that -1 is a valid index, and unlike the a[0:len(a)] case you don't have any values "before" the beginning of the list to include 0 in an _exclusive_ expression.
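The claims above are easy to check in a REPL:

```python
a = [1, 2, 3, 4]

# Inclusive start -1 (last element), exclusive stop 0: drops the first element
assert a[-1:0:-1] == [4, 3, 2]

# Omitted bounds extend to the ends of the list, so these all reverse it
assert a[::-1] == a[None:None:-1] == a[-1::-1] == [4, 3, 2, 1]

# Here the stop of -1 means "the last element", not "one before index 0",
# so the slice is empty rather than a full reversal
assert a[len(a) - 1:-1:-1] == []
```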
It's all a bit of a moot point though. I know Python especially chose its [inclusive,exclusive) convention largely because it wanted expressions like range(len(a)) to not require additional arithmetic for common use cases given that it had zero-based indexing. Julia has one-based indexing, so for common use cases an [inclusive,inclusive] pattern falls out as the most natural choice. I have no idea if Julia actually cared about that sort of thing or if such a convention came about by accident, but it seems like a clean choice for a one-based language.
Hmm, a lot of thought has gone into indexing in Julia. Julia allows indexing by position `A[1,1]`, slicing `A[1:5, :]`, linear indexing `A[1]`, logical indexing `A[A .== 1]`, relative indexing `A[end-1]`, and cartesian indexing `A[CartesianIndex(1,1)]`.
The way Julia combines these in a mostly non-conflicting and non-confusing manner is a major engineering feat.
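A quick tour of those forms on one small matrix (results noted in comments):

```julia
A = [10 20; 30 40]        # 2×2 matrix

A[2, 1]                   # positional: 30
A[1:2, 2]                 # slicing: [20, 40]
A[3]                      # linear, column-major order: 20
A[A .> 25]                # logical mask: [30, 40]
A[end, end]               # relative: 40
A[CartesianIndex(1, 2)]   # cartesian: 20
```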
For example, the following rule was found to be just the right balance between permissive and strict behaviour:
* You are permitted to index into arrays with more indices than dimensions, but all trailing indices must be 1.
* You are permitted to index into arrays with fewer indices than dimensions, but the length of all the omitted dimensions must be 1.
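Both rules can be seen with a small example (a sketch; the commented-out line errors):

```julia
A = reshape(1:6, 2, 3)     # 2×3 matrix

# More indices than dimensions: trailing indices must be 1
A[2, 3, 1]                 # ok, same as A[2, 3] == 6
# A[2, 3, 2]               # BoundsError

# Fewer indices than dimensions: omitted dimensions must have length 1
B = reshape(1:6, 2, 3, 1)  # 2×3×1 array
B[2, 3]                    # ok: the omitted third dimension has length 1
```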
I don't know; the grammar is already pretty complicated without introducing additional indexing schemes. Now that you mention it though, I don't know that I've ever written an algorithm mixing and matching negative/positive indices in a way that couldn't be trivially re-written with that kind of syntax. I'm sure such cases exist for somebody...
Looking for solutions, if you're stuck with Python and hate that behavior:
- If you don't need errors raised then a[~i] is already equal to a[-i-1].
- In your own code (or at its boundaries when wrapping external lists) it's nearly free and not much code to subclass list, override __getitem__, and raise errors for negative indices while responding to some syntax like a[0,] or a.R[0].
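A sketch of the subclassing idea (the class name is made up; only integer keys are intercepted, so slices keep their normal behaviour):

```python
class StrictList(list):
    """A list whose integer indexing rejects negative indices."""

    def __getitem__(self, key):
        if isinstance(key, int) and key < 0:
            raise IndexError(f"negative index {key} not allowed")
        return super().__getitem__(key)

a = StrictList([10, 20, 30])
assert a[0] == 10          # normal indexing still works
assert a[1:] == [20, 30]   # slices pass through untouched
try:
    a[-1]                  # would silently mean a[2] on a plain list
    assert False, "expected IndexError"
except IndexError:
    pass                   # raised as intended
```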
A mathematician here. The problem is that both 0 and 1 are useful in different situations: when dealing with sequences and series, especially ones arising from discrete models where the important thing to consider is going from one term to the next one (as is always the case when dealing with series), or when sequences are used as discrete approximations of a continuous process, it usually makes much more sense to start indexing with zero. On the other hand, using zero indexing with matrices and vectors leads to madness.
Not the other way around? My experience is that the US uses 1-based indexing for floors in buildings and that Western Europe uses 0-based indexing for floors in buildings.
Corrections welcome, but I think Europeans don’t count floors, they count additional storeys, 1-based.
I think that’s influenced from French, which has parterre (“on the ground”) for the ground floor, and étage, derived from Latin stare, “to stand”, for higher (and lower) storeys (in the end, both may come from Latin)
If Europeans use 0-based counting for floors, I would expect at least some language to say “zeroth floor”. I’m not aware of any.
> If Europeans use 0-based counting for floors, I would expect at least some language to say “zeroth floor”. I’m not aware of any.
I feel that argument does not hold water. Nobody says “second month” or “month two” in English, everyone says “February”. So it sometimes happens that a number is assigned to something, but nobody uses the number in regular speech.
I feel that the ground floor is similar: it's got a number assigned to it, but everybody uses a different word, instead (parterre in French, Erdgeschoss in German, ...).
In Germany at least, it is common to find negative floor numbers, denoting floors that are below ground. For example, an elevator might have these buttons (top to bottom):
6, 5, 4, 3, 2, 1, 0, -1, -2
But it is also common to find some other letters on the buttons, instead of the numbers. E.g.:
P2, P1, 4, 3, 2, 1, E, U1, U2
P2 and P1 would be the car park, E would be Erdgeschoss = parterre = ground floor, U1 and U2 would be “underground” (below ground).
[1] https://www.youtube.com/watch?v=kc9HwsxE1OY&t=392s