Project organization and variable scope question

Nicholas Mueschke
To start, I'm new to Julia and I'm trying it out for some scientific/engineering applications.  In particular, I'm working on moderate-size projects that are big enough that I'll need more than one file of code to stay organized (let's say anywhere from 5-50 files and 10s-100s of functions).  However, I'm struggling to figure out an appropriate way to organize my code and ensure that the proper variables are in scope where they are needed.

I come from a Matlab/C#/C++/Fortran/Basic/Pascal/etc. background and have been coding for a long time, so I'm a little baffled by Julia's structure.

Here's the basic description of my problem.  I've got a collection of (sometimes large) 1D arrays.  I'll define some starting values for these arrays and then I simply iterate on them and update the values in the arrays (basically I'm solving unsteady PDE problems).  A very simple program structure would look something like this (thinking in terms of a function-based programming approach in Matlab/C/Fortran):


--------------------------------------------------------------------------------------------
module MyProjectModule

# Include some files that have functions I need
include("SetSomeProblemParameters.jl");
include("DeclareASetOfArrays.jl");
include("AssignInitialValuesToArrays.jl");
include("CheckSomeValuesInSomeArrays.jl");
include("CalculateSomeValuesBasedUponArrayValues.jl");
include("UpdateValuesOfArrays.jl");
include("WriteResultsToDisk.jl");

# Make main() visible
export main


main(NumberOfIterations)  # This is the main entry point of the code that performs a lengthy numerical calculation

SetSomeProblemParameters()
DeclareASetOfArrays()
AssignInitialValuesToArrays()

while (NumberOfIterations not reached)
  CheckSomeValuesInSomeArrays()
  CalculateSomeValuesBasedUponArrayValues()
  UpdateValuesOfArrays()

  NumberOfIterations++
end #while

WriteResultsToDisk()

end #main

end MyProjectModule
--------------------------------------------------------------------------------------------


Simple, right?

Question #1:  If the main entry point for running a calculation, "main()", is a function, it gets its own variable workspace, right?  Now, if I write a script (not a function) and use include("some_script.jl") within main(), does Julia just inline that code within main()?  In terms of scope, should the script file be able to see all of the variables in the scope of main()?  In Matlab that would be true.  In Fortran/C it wouldn't.  I guess I'm not sure what scope implications there are for Julia script files.

Question #2:  If I've defined a bunch of functions as shown in the pseudocode above, what is the most performant way to have the large 1D arrays accessible within the scope of each function?  As you can tell, I'm trying to avoid writing functions that accept a long list of input parameters.  The old Fortran solution is to simply make the arrays global, so that each function can access them as needed.  How terrible is that idea within the Julia framework?  Also, how can I even do that?  I've tried writing a script (not a function) to declare a long list of global variables and then used include("DeclareGlobalVariables.jl") within my main().  But when I return to main(), those variables do not show up in the workspace for main().  What am I missing?

Question #3: I come from a Visual Studio IDE background, so I'm having trouble figuring out how to organize a Julia project.  I'm trying out Atom for my first Julia tests.  For a project that's bigger than just a script or a few functions, should I be defining a main entry point function within a module?  Why does Julia force modules to be added as packages so they can be loaded with the "using" command?  That seems strange.  Or should I just write everything as a collection of files with functions in them and not worry about modules?  Simple REPL and one-file Julia examples are everywhere.  There are also large coding projects/libraries/utilities on GitHub as examples, but I'm having trouble figuring out the structure of these larger projects.  I guess I'm somewhere in between these two cases: I just want to crunch some numbers, but my project is a little more complicated/sophisticated than the single-file examples.  What's the best way to proceed with such a project/file structure?

Thanks in advance for any help.

Nick







Re: Project organization and variable scope question

Ralph Smith
Unlike Matlab, Julia doesn't give special treatment to script files.
By "script" we usually mean a file containing a series of expressions
(which can include type and function definitions) intended for
evaluation in the Main module.  The Main module is where expressions
typed interactively at the REPL or IDE, or read from the file
specified on the Julia command line, are evaluated. Source files
implementing parts of other modules differ only in intention.

1) Each module has its own so-called "global" scope, which is where the
contents of "included" files are evaluated. Thus "include" does not
just paste text into your function, as it would in Fortran or C. In
fact, the inclusion is done when the surrounding code is *run*, so the
module being modified may not be the one where the function is
defined; "include" should not normally appear in functions. This
behavior is not yet clearly documented.

2) Global variables have some performance drawbacks, so are not
recommended for your situation.  What I do is to define a composite
type (e.g. "Context") containing most of the run parameters and
widely needed arrays, construct an instance of it in my "main()"
function, and pass the "context" variable as an argument to everything
else.  The "Parameters" package provides some nice macros and methods
for usage like this.

3) Most of us find it wise to organize the pieces of even mid-size
projects into modules. You don't need to make them full packages:
just "include" the top-level module files in your main script. Then
you can access their components with "MyModule.thing", or via "using"
and "import" if you prefer.
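
Points 1 and 3 above can be checked with a small experiment. This is only a sketch, assuming Julia 1.x; the file names, `seen_by_script`, and `MyModule.thing` are made up for the demonstration. A script included from inside a function is evaluated at the global scope of the enclosing module, so it cannot see the function's locals, while a module file included at top level becomes reachable through qualified names:

```julia
# Sketch only: writes two throwaway files to a temporary directory.
dir = mktempdir()

# A "script" that records whether it can see a variable named local_value.
script = joinpath(dir, "some_script.jl")
write(script, "seen_by_script = @isdefined(local_value)\n")

function main(path)
    local_value = 10      # local to main(); invisible to included code
    include(path)         # evaluated at Main's global scope, not inlined here
    return local_value
end

main(script)
println(seen_by_script)   # false: the script did not see main()'s local

# A module file included from the top-level script (no package needed).
modfile = joinpath(dir, "MyModule.jl")
write(modfile, """
module MyModule
thing() = "hello from MyModule"
end
""")

include(modfile)
println(MyModule.thing())
```

The variable the script assigns ends up as a global of Main, which is also why globals declared by an included script never appear as locals of the calling function.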

Finally a couple of syntax points: you need the word "function" in the
definition of "main()", and Julia doesn't have the "++" operator.
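
Putting the "context" suggestion together with the two syntax fixes might look roughly like this. It is a sketch under assumptions, not the code from the original project: it uses Julia 1.x syntax (`struct`), and the field names, the diffusion-style update, and the grid size are invented placeholders for the real PDE solver.

```julia
# Hypothetical container for run parameters and widely used arrays.
struct Context
    alpha::Float64           # made-up diffusion number
    u::Vector{Float64}       # current solution
    unew::Vector{Float64}    # work array for the next step
end

# Each routine takes the single context argument instead of a long
# parameter list or global arrays.
function update!(ctx::Context)
    u, unew = ctx.u, ctx.unew
    unew[1] = u[1]           # hold boundary values fixed
    unew[end] = u[end]
    for i in 2:length(u)-1
        unew[i] = u[i] + ctx.alpha * (u[i-1] - 2 * u[i] + u[i+1])
    end
    copyto!(u, unew)         # arrays are mutated in place
    return ctx
end

function main(number_of_iterations)   # note the `function` keyword
    u = zeros(11)
    u[6] = 1.0                         # initial spike in the middle
    ctx = Context(0.25, u, similar(u))
    iteration = 0
    while iteration < number_of_iterations
        update!(ctx)
        iteration += 1                 # Julia has no ++ operator
    end
    return ctx
end

ctx = main(10)
println(ctx.u)
```

Because the fields of `Context` have concrete types, functions that receive it can be compiled for those types, which is the main thing non-constant globals prevent.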




Re: Project organization and variable scope question

Mauro
In reply to this post by Nicholas Mueschke
Welcome to Julia!

Without having read your post in detail, your example looks like you're
using global variables to hold the state, as you're not passing anything
around.  Apart from being bad style in general, in Julia this is also bad
for performance (read [1]).  The compiler cannot work well with
non-constant globals.  If you need them, declare their bindings `const`.
If you do need to use globals from another module, then you need to
qualify them, as a module introduces a namespace, e.g.
`MyModule.global1`.  Also, have you read
http://docs.julialang.org/en/release-0.5/manual/variables-and-scoping/?
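
A minimal sketch of the two suggestions about globals, assuming Julia 1.x; the module `Sim` and the names `GRID`, `counter`, `fill_grid!`, and `bump!` are hypothetical. A `const` binding fixes the type the compiler sees while the array contents stay mutable, and code outside the module reaches the globals through qualified names:

```julia
module Sim

const GRID = zeros(5)    # const binding: always a Vector{Float64}, contents mutable
counter = 0              # non-const global: its type may change, so access is slower

function fill_grid!(value)
    for i in eachindex(GRID)
        GRID[i] = value  # mutating a const array is allowed
    end
    return GRID
end

function bump!()
    global counter       # required to assign to a module-level global
    counter += 1
    return counter
end

end # module Sim

# Outside the module, names must be qualified (or imported).
Sim.fill_grid!(2.0)
Sim.bump!()
println(Sim.GRID)
println(Sim.counter)
```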

(Note that most discussion seems to have moved to
https://discourse.julialang.org/.  If my answer didn't help, maybe post
there again.  Although a shorter post might help to cater to people's
short attention span.)

[1] http://docs.julialang.org/en/release-0.5/manual/performance-tips/


Re: Project organization and variable scope question

Nicholas Mueschke
In reply to this post by Ralph Smith
Ralph... thanks for the clarifications and suggestions.  I'll test them out.

Nick


Re: Project organization and variable scope question

Nicholas Mueschke
In reply to this post by Mauro
Mauro,

Thanks.  I've read the scope and performance sections you mentioned.  Part of the problem comes from the often competing approaches of scientific/numerical computing and well-structured programming practice.  I have my hand in both pots at times.  While global variables are viewed as horrible style by anyone with a formal education in computer science, they still live on in glorious fashion (along with everything else from Fortran 77 and Fortran 90) in modern scientific codes.  I'm fine with migrating some legacy codes that use such global-variable approaches to newer formats/styles.  But, at the moment, I'm just trying to walk before I run.

Nick
