Linux Format - UK (2020-03)

http://www.techradar.com/pro/linux March 2020 LXF260 93

Dates & time CODING ACADEMY


Figure 3: A first try to create a calendar for January 2020 using ggcal and
ggplot2. Although the output is not very sophisticated, it does its job
pretty well.

machine and the Docker image that will enable the
running container to see the R scripts that we are going
to create without any extra steps. Alternatively, you can
copy and paste the R code into RStudio.
Although you can do the same using additional
parameters in the docker run command, a
docker-compose.yml configuration file makes things much
simpler. Its contents are the following:
version: '2'

services:
  r:
    image: rocker/rstudio:latest
    container_name: r-project
    restart: always
    environment:
      - PASSWORD=LXF
    ports:
      - 8787:8787
    volumes:
      - $HOME/Public/r/scripts:/data

Please store docker-compose.yml in its own
directory to keep things simple. After
that you can start the container by running
docker-compose up and stop it by running docker-compose
down from inside that directory. The RStudio username
is rstudio, whereas the password is what you put in the
PASSWORD field in docker-compose.yml – in this case
LXF. The volumes part associates a directory
on the local machine ($HOME/Public/r/scripts) with a
directory in the running container (/data). Everything
you put in $HOME/Public/r/scripts will be visible in the
running container. Lastly, the port number to connect to
the web interface of RStudio will be 8787 – this is
defined in the ports section of docker-compose.yml.
Note that transferring that docker-compose.yml to
any other computer with any operating system that
supports Docker will create exactly the same
environment for you to work in. The only thing that
might change is the path of the shared volume, which is
an optional parameter. You can get a shell in the running
container by executing docker exec -it r-project bash.


Hot dates in R
R offers the built-in Date, POSIXlt and POSIXct classes
for storing dates and times, as well as the zoo and chron
packages. This tutorial will only use the built-in types of
R. The general principle is that if you are using a non-
standard format, you will have to specify the format you
are using. Moreover, R allows you to do calculations with
dates, which can be very practical.
The POSIXlt (lt: local time) class allows you to easily
extract specific components out of a time because it
uses a list with separate vectors for storing the year,
month, day of week, day within the year, etc. It is
important to remember that POSIXlt objects, which are
lists, are not continuous variables.
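As a minimal sketch of the component access described above (the timestamp and the variable names are made up for illustration), the fields of a POSIXlt object can be read with the $ operator. Note that some fields are stored as offsets rather than calendar values:

```r
# Illustrative timestamp; tz = "UTC" keeps the result independent
# of the time zone of the machine running the code.
t <- as.POSIXlt("2020-01-15 13:45:30", tz = "UTC")

year  <- t$year + 1900  # years are stored as an offset from 1900
month <- t$mon + 1      # months are stored as 0-11
mday  <- t$mday         # day of the month, 1-31
wday  <- t$wday         # day of the week, 0 = Sunday
yday  <- t$yday         # day within the year, 0-based
hour  <- t$hour

print(c(year = year, month = month, mday = mday,
        wday = wday, yday = yday, hour = hour))
```

Running unclass(t) on such an object shows the underlying list, which is why POSIXlt values are handy for extracting parts of a time but awkward as a numeric column.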
POSIXct (ct: calendar time) is the best class when
you have times in your data; among other things, the
class enables you to specify the time zone of a date.
POSIXct keeps the epoch time, which means that it
holds the number of seconds passed since 00:00:00
UTC Thursday 1 January 1970, which also means that
POSIXct variables are continuous variables. In practice,
this means that if you want to make statistical
calculations that involve dates and times, using
POSIXct variables is the right choice.
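The following sketch illustrates the points above with an arbitrary timestamp chosen for this example: a POSIXct value is a number of seconds underneath, so arithmetic and summary functions work on it directly. The time zone name in the last line is only an assumption about what your system knows.

```r
# Illustrative timestamp, created explicitly in UTC.
ct <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC")

# The underlying value: seconds since 1970-01-01 00:00:00 UTC.
secs <- as.numeric(ct)

# Numeric storage means arithmetic works directly.
later    <- ct + 3600                 # one hour later
midpoint <- mean(c(ct, ct + 7200))    # average of two instants

print(secs)
print(later)

# The same instant displayed in another time zone
# (assumes the zone name is available on your system).
print(format(ct, tz = "Europe/London", usetz = TRUE))
```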
If you only have dates in your data, then you should
use the Date class. Type help(DateTimeClasses) to get
more information about date and time classes.
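For data that contains dates only, a short sketch of the Date class might look like this. The dates and variable names are invented for the example:

```r
# Dates in the standard YYYY-MM-DD format need no format argument.
start <- as.Date("2020-02-01")
end   <- as.Date("2020-03-01")

# Subtracting two dates gives a difftime in days
# (February 2020 has 29 days because 2020 is a leap year).
n_days <- end - start

# Adding a number moves the date forward by that many days.
next_week <- start + 7

# seq() can generate regular sequences of dates.
weekly <- seq(start, by = "week", length.out = 4)

print(n_days)
print(next_week)
print(weekly)
```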
Additionally, the strptime(), as.POSIXct() and
as.POSIXlt() functions allow you to convert a factor or a
string into a date – the user must provide a format
statement in double quotes to inform R about the
structure of the input. The difftime() function can also
help you find the difference between two dates.
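The conversion functions mentioned above can be sketched as follows. The input strings are made up for illustration, and the factor is converted with as.character() first as a defensive habit rather than a requirement:

```r
# strptime() returns a POSIXlt object; the format string
# describes the layout of the input.
lt <- strptime("01/03/2020 18:30", format = "%d/%m/%Y %H:%M", tz = "UTC")

# as.POSIXct() accepts a format string in the same way.
ct <- as.POSIXct("2020-03-02 06:30", format = "%Y-%m-%d %H:%M", tz = "UTC")

# difftime() lets you choose the units of the result.
gap <- difftime(ct, lt, units = "hours")

# Dates stored as a factor: convert to character, then to Date.
dates_f <- factor(c("2019-12-25", "2020-01-01"))
dates   <- as.Date(as.character(dates_f))
day_gap <- difftime(dates[2], dates[1], units = "days")

print(gap)
print(day_gap)
```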
It is very important to use the correct code (format
component) when parsing dates and times in R: %Y is
for four-digit years, whereas %y is for two-digit years.
Use %d to declare the day of the month, and use %m for
declaring the month as a decimal number. Use %B as
the code for the full name of a month, and then %b for
the abbreviated name of a month. It is both necessary
and wise to try things using small samples before
working with real data, especially when dealing with
dates and times, where it is easy to make small typos
that can create big errors.
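A small sample of the kind recommended above might look like the sketch below. The dates are arbitrary; because month names depend on the current locale, the text for %B and %b is generated with format() first and then parsed back, instead of being hard-coded in English.

```r
# %Y expects a four-digit year, %y a two-digit one.
a <- as.Date("14/03/2020", format = "%d/%m/%Y")
b <- as.Date("14/03/20",   format = "%d/%m/%y")

# %B is the full month name and %b the abbreviated one;
# both depend on the locale, so round-trip through format().
d <- as.Date("2020-03-14")
long_text  <- format(d, "%d %B %Y")
short_text <- format(d, "%d %b %y")
back_long  <- as.Date(long_text,  format = "%d %B %Y")
back_short <- as.Date(short_text, format = "%d %b %y")

# A format that does not match the input (day and month swapped
# here) silently gives NA instead of raising an error.
bad <- as.Date("14/03/2020", format = "%m/%d/%Y")

print(long_text)
print(short_text)
print(bad)
```

The last case is the one to watch for: a wrong format code does not stop the script, so checking for NA values after parsing is a cheap safeguard.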
It is now time to see some of the aforementioned
functions in action. Firstly, we will begin by presenting
some simple R commands that allow you to get the
current date and time:
> Sys.time()
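As a brief sketch of how the current date and time relate to the classes discussed earlier (the output depends on when and where the code runs, so none is shown):

```r
now   <- Sys.time()   # current date and time, a POSIXct object
today <- Sys.Date()   # current date only, a Date object

print(class(now))
print(class(today))

# format() turns either value into text using the usual format codes.
print(format(now, "%Y-%m-%d %H:%M:%S"))
print(format(today, "%d/%m/%Y"))
```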

If you are looking for something simpler but programmable for creating
your own calendars, you can give Graphviz (http://graphviz.org) a try.
However, as calendar creation is not directly supported by Graphviz, you
will need to write code.

R DATA TYPES


List – a generic vector containing other objects. A plain vector, by
contrast, is a sequence of data elements of the same basic type.
Data frame – a list of vectors of equal length that is primarily used for
storing data tables. It is used a lot in R and is equivalent to the
concept of a table.
Array – a multi-dimensional object.
Factor – an object of mode numeric and class factor, used for storing
categorical data.
Matrix – a two-dimensional array that typically contains numeric data and
has rows and columns.
The mode() function returns the mode of an object, whereas the
class() function returns the class of an object – they can both be very
useful when you do not know the kind of R object you are dealing with.
The data types that you are going to use depend on the data that
you want to store. Usually, you do not write the data yourself; most of
the time you get your data from external sources, including text files,
databases and the internet. External data is typically read with
the read.table() function, which returns a data frame. Note
that there will be times when you will need to transform your data,
both external and internal, into the desired format.
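The types listed above can be explored with a short sketch like the following. All object names and the inline sample data are invented for this example; the text argument of read.table() is used here only so that the snippet does not need an external file.

```r
lst <- list(1, "a", TRUE)                          # list: mixed contents
arr <- array(1:24, dim = c(2, 3, 4))               # three-dimensional array
f   <- factor(c("low", "high", "low"))             # factor
m   <- matrix(1:6, nrow = 2)                       # 2 x 3 matrix

# mode() and class() describe an object in different ways.
print(mode(f))     # storage mode of the factor
print(class(f))    # class of the factor
print(mode(lst))
print(dim(arr))
print(dim(m))

# read.table() returns a data frame; here it reads inline text.
tbl <- read.table(text = "day value\n2020-01-01 3\n2020-01-02 5",
                  header = TRUE)

# Transform the date column into the desired format. Depending on the
# R version it arrives as character or factor, so convert explicitly.
tbl$day <- as.Date(as.character(tbl$day))

print(class(tbl))
print(class(tbl$day))
```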
