Techniques for Customizing Graph Axes

Grinding axes: Axis scales, labels and ticks
Nicholas J. Cox
Department of Geography

Aims
This is a round-up of some techniques for graph axes, ranging from some simple tricks to some community-contributed commands, both old and new.
Code to reproduce all graphs will be posted after the meeting.
[Image: Axis axis, the chital or spotted deer]

Commands from SSC and Stata Journal
nicelabels: SSC and Stata Journal 22(4) in press
niceloglabels: Stata Journal 18(1): 262-286 and 20(3): 1028
mylabels and myticks: SSC and Stata Journal 22(4) in press
qplot: latest Stata Journal 19(3): 748
distplot: latest Stata Journal 19(1): 260
transplot: SSC

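A minimal sketch of how these commands might be installed. It assumes that each SSC package name matches the command name given on this slide and that myticks comes bundled with mylabels; neither assumption is stated in the talk, so check with ssc describe first.

```stata
* SSC packages named on this slide (package names are assumed to match command names)
ssc describe mylabels        // check what the package contains before installing
ssc install nicelabels
ssc install mylabels         // assumed to include myticks
ssc install transplot

* commands published through the Stata Journal can be located by keyword
search niceloglabels, sj
search qplot, sj
search distplot, sj
```

The search results include links from which the Stata Journal versions can be installed.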
Once stated, often applied
A common, perhaps increasingly common, need is for a series of graphs produced by a loop or other repetition to have a pre-stated consistent style.
The grey area is where graph will make decisions for you that turn out to be what you don’t want.
So, you may need to spell out your desires more explicitly.

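One way to state a style once and apply it often is to hold the common options in a local macro and reuse it inside a loop. This is only a sketch: the choice of variables and options is illustrative and is not taken from the talk.

```stata
sysuse auto, clear

* state the style once: marker symbol, marker colour, horizontal y axis labels
local common ms(Oh) mc(blue) yla(, ang(h))

* apply it to every graph produced by the loop
foreach v in weight length displacement {
    scatter mpg `v', `common' name(g_`v', replace)
}
```

Any axis label specification that must be identical across graphs, such as an explicit yla() list, can go into the same macro.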
Ever needed
a tick to be suppressed?
labels between ticks, not at them?
minor nudging of labels?
a logarithmic scale but found default labels undesirable?
automatic choice of ‘nice’ labels that is under your control?
a slightly non-standard scale such as logit, reciprocal or root?

Suppressing ticks
Axis ticks are like marks on a ruler showing a graduated scale.
You might want to suppress them, particularly if your scale is categorical, not quantitative.
The sub-option noticks is the obvious thing to try, but it doesn’t always work or may not be quite right.
Other tricks are to set tlcolor(bg) or tlcolor(none) or to adjust tlength(), e.g. to zero.
A tick one can’t see is in effect not present.
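The alternatives just listed can be tried side by side. This sketch uses the auto data with rep78 as a stand-in categorical axis; it is not the example graphed in the talk.

```stata
sysuse auto, clear

* rep78 is a five-category repair record, so ticks on the x axis add little
scatter mpg rep78, xla(1/5, noticks)          // the obvious thing to try
scatter mpg rep78, xla(1/5, tlcolor(bg))      // ticks drawn in the background colour
scatter mpg rep78, xla(1/5, tlcolor(none))    // ticks with no colour
scatter mpg rep78, xla(1/5, tlength(0))       // ticks of zero length
```

The variants differ in how much space is left between the axis and the label text, which is why one may suit a given graph better than another.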
 
The ticks on the x axis are not needed, as the scale is categorical.
 
Here we set tlc(bg) tlength(2) to suppress the tick but also to keep the label text at a modest distance from the axis.
We also stretched the axis with xsc().
We also changed the alignment, to be explained next.

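A sketch of the same combination on the auto data. The range given to xsc() is an illustrative guess at "stretching"; the talk does not state the values used.

```stata
sysuse auto, clear

* invisible ticks of length 2 keep the labels a little away from the axis;
* xsc(r()) extends the axis slightly beyond the first and last categories
scatter mpg rep78, xla(1/5, tlc(bg) tlength(2)) xsc(r(0.5 5.5))
```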
Aligning labels and ticks
Usually label text is centred on the corresponding tick.
If you want text to start or end at the tick, use a small angle, say ang(-0.0001) for left justification or ang(0.0001) for right justification.
You may need to tweak other settings.
Vince Wiggins taught me this trick. See also our paper at
https://www.stata-journal.com/article.html?article=gr0079

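The small-angle trick might be tried like this. The data and the tlength(2) setting are illustrative choices, and other settings may need tweaking, as noted above.

```stata
sysuse auto, clear

* a tiny negative angle: label text left-justified at the tick
scatter mpg rep78, xla(1/5, ang(-0.0001) tlength(2))

* a tiny positive angle: label text right-justified at the tick
scatter mpg rep78, xla(1/5, ang(0.0001) tlength(2))
```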
Labels may refer to intervals, not points
Sometimes a text label refers to an interval, not a point.
Consider time series.
A time series extending over say 100 years is usually treated as a series of points and we don’t try to label each year on the time axis.
A time series extending over say 10 years or less is one where a different approach often helps, namely
 labels without ticks in the middle of each interval
 big ticks at the ends of each interval.

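A sketch of labels in the middle of each year and long ticks at year ends, for quarterly data. The series is made up for illustration (with a fourth-quarter bump) and is not the data graphed in the talk; the macro names are arbitrary.

```stata
clear
set obs 40
gen t = yq(1990, 1) + _n - 1                  // quarterly dates 1990q1 to 1999q4
format t %tq
gen y = 100 + _n + 10 * (mod(_n, 4) == 0)     // made-up series, higher in each 4th quarter

* year labels at mid-year positions, to be shown without ticks
local xla
forval yr = 1990/1999 {
    local mid = yq(`yr', 1) + 1.5
    local xla `xla' `mid' "`yr'"
}

* long ticks at the boundaries between years
local first = yq(1990, 1) - 0.5
local last  = yq(1999, 4) + 0.5

line y t, xla(`xla', noticks) xtick(`first'(4)`last', tlength(*3)) xtitle("")
```

Each year occupies four quarterly values, so its midpoint lies 1.5 after the first quarter and the boundaries lie 0.5 before the first and 0.5 after the last.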
Christmas is coming, or showing seasonal detail
A small bonus of this trick, using grid lines too, is that we get to see more clearly that turkey sales are usually highest in the 4th quarter.
More generally, detail on seasonality can be important, or at least interesting.
This was written up at
https://www.stata-journal.com/article.html?article=gr0030

A pet peeve (I have others)
I often see time or year as x axis title.
Who needs that? It can be cut without loss.
In your past there was some teacher (of physics???) who was savage if you did not give precise axis titles.
That teacher was right, except in this case.

Nudging axis labels slightly
On graphs with multiple panels, labels can get unfortunately close, or even overlap.
You can increase the separation of panels, which may be wasteful of space, or just add spaces on the fly to nudge the end labels inwards.
For example:
xla(1935 " 1935" 1955 "1955 " 1940(5)1950)
Who introduced a(b)c notation, and when? The first use known to me was by J.W. Tukey in 1948.

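In context, the nudge might look like this. The Grunfeld data are used because they are used elsewhere in this talk; the panels shown in the talk may have been drawn differently.

```stata
webuse grunfeld, clear

* a leading space pushes "1935" to the right; a trailing space pushes "1955" to the left
line invest year, by(company) ///
    xla(1935 " 1935" 1955 "1955 " 1940(5)1950) xtitle("")
```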
Labels on logarithmic scale
That example leads naturally to the question of plotting on logarithmic scale.
graph doesn’t do an especially good job of automating “nice” labels on logarithmic scale.
This is what you get by asking for the previous graph to be shown with ysc(log).

niceloglabels
A generous interpretation is that Stata is saying “You should know what you want here, so back to you to tell me”.
A few failed attempts at better code indicated that there isn’t a solution, except what people prefer as their style and that works well for their data.
Something better is offered by niceloglabels, which
 suggests nice log labels given a style choice
 puts their specification into a local macro for later use.

So what are “nice” log labels?
As in the rest of life, “nice” can be hard to define precisely, but easier to recognise in practice.
niceloglabels suggests labels depending on a range and a preferred style. So,
 style 1 is for labels that are powers of 10
 style 2 is for powers of 2
 style 13 is for sequences like 1, 3, 10, 30, 100
 style 125 is for sequences like 1, 2, 5, 10, 20, 50, 100
And there are others.
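A sketch of the workflow: ask for labels in a chosen style, then pass the local macro to the graph command. The local() and style() option names are assumed from the Stata Journal write-up of niceloglabels cited in this talk; check the help file of your installed version.

```stata
webuse grunfeld, clear

* suggest labels such as 1 2 5 10 20 50 ... covering the range of invest
niceloglabels invest, local(yla) style(125)

* use the suggestion on a logarithmic y axis
line invest year, by(company) ysc(log) yla(`yla', ang(h)) xtitle("")
```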
 
Yet more: you can specify that you want to see
 powers with superscripts like 10^6 or 10^-9
 or unit fractions such as 1/10 or 1/16.

nicelabels
nicelabels came after niceloglabels, as it is needed less often.
It extends James Hardin’s nicenum from Stata Technical Bulletin 25: 2-3 (1995).
In essence, 1, 2 and 5 times powers of 10 are nice. 1, 20 and 500 are nice.
Given a range (equivalently a variable with its range), nicelabels can suggest tight labels (within the range) or loose labels (wider than the range).
It can be used together with other preferences such as always showing zero or always showing the observed minimum and maximum.
 
Being nice isn’t everything
Labels at 0 25 50 75 100 might be exactly what you need for labelling extremes, median and quartiles or benchmarks on any percent scale.
For hours of the day, 0(3)24 or 0(6)24 could be good.
For map directions as compass bearings, 0(45)360° or 0(90)360° could be good (noting that 0° and 360° are the same direction).
 
. numlist "22652(14)22820"
. di "`r(numlist)'"
22652 22666 22680 22694 22708 22722 22736 22750
22764 22778 22792 22806 22820
may look horrible, but the list denotes every other Friday in the first half of 2022 and could be the basis for acceptable date labels.

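To see how such a list could become date labels, here is a sketch with a made-up daily series. The display format and the vertical label angle are illustrative choices, not taken from the talk.

```stata
clear
set obs 181
gen date = mdy(1, 1, 2022) + _n - 1           // 1 January to 30 June 2022
format date %td
gen y = sin(_n / 20)                          // made-up series

* the numlist gives the positions; the format makes the labels readable
line y date, xla(22652(14)22820, format(%tdMon_dd) ang(v)) xtitle("")
```

Daily dates in Stata count days from 1 January 1960, which is why the raw values look large.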
nicelabels with a numeric range
. nicelabels 142 233, local(foo)
step:      20
labels:    140 160 180 200 220 240
. nicelabels 142 233, local(foo) tight
step:      20
labels:    160 180 200 220
. nicelabels 142 233, local(foo) nvals(10)
step:      10
labels:    140 150 160 170 180 190 200 210 220 230 240
. nicelabels 142 233, local(foo) nvals(10) tight
labels:    150 160 170 180 190 200 210 220 230

nicelabels with a numeric variable (census.dta)
. nicelabels medage, local(agela)
step:      5
labels:    20 25 30 35
. nicelabels medage, local(agela) tight
step:      5
labels:    25 30
. nicelabels medage, local(agela) nvals(10)
step:      2
labels:    24 26 28 30 32 34 36
. nicelabels medage, local(agela) nvals(10) tight
step:      2
labels:    26 28 30 32 34

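The point of the local macro is that the suggestion can be passed straight to a graph command. A sketch, in which the choice of histogram is illustrative only:

```stata
sysuse census, clear

* put suggested labels for median age into local macro agela
nicelabels medage, local(agela) nvals(10)

* use them as x axis labels
histogram medage, xla(`agela')
```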
Axis labels must start at zero?
A variable is all positive, but regardless you want to insist on labels starting at zero:
. sysuse auto, clear
(1978 automobile data)
. summarize mpg, meanonly
.
. nicelabels 0 `r(max)', local(foo)
step:      10
labels:    0 10 20 30 40 50

Observed minimum and maximum should be labels?
This mix isn't guaranteed to be nice!
. nicelabels mpg, tight local(yla)
step:      10
labels:    20 30 40
. summarize mpg, meanonly
. local yla `yla' `r(min)' `r(max)'
(similar code for weight)
. scatter mpg weight, xla(`xla') yla(`yla', ang(h)) ms(Oh)

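Spelled out in full, with the "similar code for weight" written in by analogy. The weight lines are a reconstruction, not copied from the talk.

```stata
sysuse auto, clear

* y axis: nice labels within the range, plus observed minimum and maximum
nicelabels mpg, tight local(yla)
summarize mpg, meanonly
local yla `yla' `r(min)' `r(max)'

* x axis: the same steps for weight (assumed to mirror the mpg code)
nicelabels weight, tight local(xla)
summarize weight, meanonly
local xla `xla' `r(min)' `r(max)'

scatter mpg weight, xla(`xla') yla(`yla', ang(h)) ms(Oh)
```

If an extreme lies close to a nice label, the two labels may collide, which is one reason the mix isn't guaranteed to be nice.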
Want at least 5 labels?
You can count the number suggested and tell nicelabels to try again if it does not suggest enough.
Some degree of automation may be important to some users.
. nicelabels mpg, tight local(yla)
step:      10
labels:    20 30 40
. if wordcount("`yla'") < 5 nicelabels mpg, tight local(yla) nvals(10)
step:      5
labels:    15 20 25 30 35 40

mylabels and myticks
mylabels was written to support use of any transformed scale whatsoever.
Hence, values are plotted on one scale, but the labels you want to see are on another scale, usually that of the original data.
The main idea was to support transformations other than logarithm, which often is supported directly by ysc(log) and xsc(log).
You need to specify your scale using @, which imparts some flexibility.
The inspiration was given by Patrick Royston in Stata Technical Bulletin 34: 9-10 (1996).

Some useful transformed scales
Square root
Cube root
Reciprocal
Logit
Folded root = sqrt(p) - sqrt(1 - p)
Neglog = sign() * ln(1 + abs())
Inverse sinh = asinh() and inverse tanh = atanh()
… to name only a magnificent eight … and not yet naming any quantile scales

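For reference, each of the eight can be computed with built-in Stata functions. The variables p and x are made up here: p is a proportion strictly between 0 and 1, and x takes both negative and positive values.

```stata
clear
set obs 99
gen p = _n / 100                 // proportions 0.01 to 0.99
gen x = (_n - 50) / 5            // values from -9.8 to 9.8

gen root_p   = sqrt(p)                       // square root
gen curt_p   = p^(1/3)                       // cube root
gen recip_p  = 1 / p                         // reciprocal
gen logit_p  = logit(p)                      // logit = ln(p / (1 - p))
gen froot_p  = sqrt(p) - sqrt(1 - p)         // folded root
gen neglog_x = sign(x) * ln(1 + abs(x))      // neglog
gen asinh_x  = asinh(x)                      // inverse hyperbolic sine
gen atanh_p  = atanh(2 * p - 1)              // inverse tanh, argument rescaled to (-1, 1)
```

Logit, folded root and atanh treat p and 1 - p symmetrically; neglog and asinh do the same for negative and positive x.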
The implication is that we need generality and flexibility
Let’s analyse mpg from the auto data in terms of its reciprocal but show labels in terms of miles per gallon for easier interpretation.
sysuse auto, clear
set scheme s1color
* factor of 1000 is for convenience in regression
gen gpm = 1000/mpg
regress gpm weight
mylabels 12 15(5)35 41, myscale(1000/@) local(yla)
scatter gpm weight, ms(Oh) mc(blue) yla(`yla', ang(h)) ///
ytitle(Miles per gallon (reciprocal scale))

myticks too is available
myticks 12/41, myscale(1000/@) local(myyti)
Transformations can be motivated in terms of where they stretch and where they squeeze, relatively speaking.
The pattern of axis ticks can make this vivid.

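Labels and ticks might then be combined in one graph as follows. Passing the ticks through ytick() is an assumption about how the graph in the talk was drawn; ymtick() would be an alternative.

```stata
sysuse auto, clear
gen gpm = 1000/mpg

* labels at selected mpg values, ticks at every integer mpg from 12 to 41
mylabels 12 15(5)35 41, myscale(1000/@) local(yla)
myticks 12/41, myscale(1000/@) local(myyti)

scatter gpm weight, ms(Oh) mc(blue) yla(`yla', ang(h)) ytick(`myyti') ///
    ytitle(Miles per gallon (reciprocal scale))
```

Ticks bunch together where the reciprocal scale squeezes (high mpg) and spread out where it stretches (low mpg).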
@ indicates the desired scale
mylabels 12 15(5)35 41, myscale(1000/@) local(yla)
I want to see labels with text 12 15(5)35 41.
The scale used is 1000/that.
So where you have value 1000/12 (work it out!), show text 12.
And so on.

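Working it out: the loop below prints, for each label text, the position on the plotted scale at which it belongs. It only illustrates the arithmetic and is not the internal code of mylabels.

```stata
* position on the 1000/mpg scale for each label text
foreach v of numlist 12 15(5)35 41 {
    display %8.3f (1000/`v') "  is shown as  " `v'
}
```

So 1000/12, about 83.333, carries the text 12, and 1000/41, about 24.390, carries the text 41.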
Use Stata syntax for transformation
You can use standard Stata syntax to indicate the transformation, including function calls.
sqrt(@)
sign(@) * ln(1 + abs(@)) OR sign(@) * log1p(abs(@))
… or whatever else you want.

Celsius from Fahrenheit
webuse citytemp, clear
summarize
scatter tempjuly tempjan
mylabels 10(5)35, myscale(32 + (9/5)* @) local(myyla)
mylabels -15(5)20, myscale(32 + (9/5)* @) local(myxla)
scatter tempjuly tempjan, ms(Oh) mc(blue) xli(32, lc(gs8)) ///
yla(`myyla', ang(h)) xla(`myxla') ///
ytitle(Average July temperature ({&degree}C)) ///
xtitle(Average January temperature ({&degree}C))

Use all axes to show dual scales?
scatter tempjuly tempjan, ms(Oh) mc(blue) yaxis(1 2) xaxis(1 2) ///
xli(32, lc(gs8)) ///
yla(`myyla', ang(h) axis(1)) xla(`myxla', axis(1)) ///
yla(50(9)95, axis(2) ang(h)) xla(5(9)68, axis(2) grid) ///
ytitle(Average July temperature ({&degree}C), axis(1)) ///
ytitle(Average July temperature ({&degree}F), axis(2)) ///
xtitle(Average January temperature ({&degree}C), axis(1)) ///
xtitle(Average January temperature ({&degree}F), axis(2))

Yet more
mylabels has prefix() and suffix() options to add text to each axis label, such as % signs, currency symbols, or units of measurement.
The firstonly and lastonly options specify adding that text only to the first or last label on that axis.

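A hedged sketch of what a call might look like. The option names come from this slide, but the exact form of their arguments is an assumption here, as is the use of myscale(@) for an untransformed scale; check the mylabels help file.

```stata
* percent signs on every label of an untransformed percent scale (argument form assumed)
mylabels 0(25)100, myscale(@) suffix(%) local(pctla)

* inspect what was put into the local macro before using it in yla() or xla()
display `"`pctla'"'
```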
Festina lente
mylabels was first posted to SSC in 2003
https://www.stata.com/statalist/archive/2003-05/msg00084.html
and is written up in Stata Journal 22(4) in press.
Some projects move more slowly than others….

Other uses of the @ syntax for flexible scales
qplot is a general purpose quantile plot command that goes back to Stata Technical Bulletin 51: 16-18 (1999).
The latest version is at Stata Journal 19: 748 (2019).
A trscale() option allows cumulative probabilities to be mapped to some other scale, e.g. normal or Gaussian standard deviates.
Similar comments apply to distplot for distribution function plots.

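For instance, a normal quantile plot could be requested through trscale(), with @ standing for the cumulative probability. This is a sketch; the axis title is an illustrative choice.

```stata
sysuse auto, clear

* map cumulative probabilities to standard normal deviates
qplot mpg, trscale(invnormal(@)) xtitle(Standard normal deviate)
```

Other mappings, such as logit(@), follow the same pattern.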
Words from the wise
“It can be useful to plot an observed distribution against the standard Gaussian even though there is no question of it being Gaussian in shape. The motive is that it is easier to study a distribution by comparing it with a standard shape than just by looking at it.”
Michael Hills (1934-2021). Statistics for Comparative Studies. London: Chapman and Hall, p.28 (1974)
“The normal QQ-plot is a useful exploratory tool even for nonnormal data. The plot shows skewness, heavy-tailed or short-tailed behaviour, digit preference, or outliers and other unusual values.”
Yudi Pawitan (1960-). In All Likelihood. Oxford: Oxford University Press, p.92 (2001)

transplot
transplot (SSC) supports various plots in which one or both variables may be on transformed scales.
https://www.statalist.org/forums/forum/general-stata-discussion/general/1561836-transplot-package-downloadable-from-ssc
gives a quick overview.
https://www.stata.com/meeting/uk19/slides/uk19_cox.pptx talked about transplot earlier when under development (slides 31 to 48).

Rules in transplot, more or less
0. If no transform is mentioned, use that variable as it comes.
1. If @ is specified, use that variable as it comes.
2. If a Stata function is mentioned, apply that function to a variable: e.g. log10 means log10(@)
3. Otherwise apply the expression given: e.g. sqrt(@) - sqrt(1 - @)
4. (Undocumented!) Try the code as a call to an egen function.
These rules are a fairly elaborate implementation of flexible scaling, and so the corresponding code may interest Stata user-programmers.
 
webuse grunfeld, clear
transplot qnorm invest mvalue kstock, trans(@ log10) ///
ms(Oh) mc(blue)
transplot qnorm invest mvalue kstock, trans(@ log10) ///
ms(Oh) mc(blue) combine(colfirst)
transplot qnorm invest mvalue kstock, trans(@ log10) ///
combine(colfirst) recast(line) lc(blue) lw(medthick)

In short
What transplot does typically is fire up a portfolio of graphs for all variables and transformations mentioned and then show them using graph combine. (The identity transformation is allowed.)
We can reach through and add options of graph combine, in this case to re-order the graphs.
We can as usual reach through and tune the options of the graph command used, in this case qnorm to recast(line).
In this example, the conclusion is mundane: we would be better off for all variables working on logarithmic scale, as is clear on other grounds.
 
References and some other notes follow this slide.
The code will be posted after the talk.

Stata Journal references (covering further topics too)
2022 Automating axis labels: Nice numbers and transformed scales. SJ 22 in press
2021 Adding marginal spike histograms to quantile and cumulative distribution plots. SJ 21: 838-846 [discusses square root scales]
2021 Shorter or fewer labels with graph bar? SJ 21: 263-271
2019 Quantile plots, generalized. SJ 19: 748 [previous versions back to 1999]
2019 Distribution function plots. SJ 19: 260 [previous versions back to 1999]
2019 (N.J. Cox and V. Wiggins) Tiny tricks and tips on ticks. SJ 19: 741-747
2018 Logarithmic binning and labeling. SJ 18: 262-286 (Update 20: 1028)
2012 Axis practice, or what goes where on a graph. SJ 12: 549-561 (Update 19: 748)
2012 Transforming the time axis. SJ 12: 332-341
2008 Between graphs and tables. SJ 8: 269-289
2008 Plotting on any transformed scale. SJ 8: 142-145
2007 Better axis labeling for time points and time intervals. SJ 7: 590-592
2005 Axis labels on two or more levels. SJ 5: 469
 
All graphs use Stata scheme s1color, which I strongly recommend as a lazy but good default.
This font is Georgia.
This font is Lucida Console.