Ranking Refactoring Suggestions Based on Historical Volatility

Ranking Refactoring Suggestions
based on Historical Volatility
Nikolaos Tsantalis         Alexander Chatzigeorgiou
University of Macedonia
Thessaloniki, Greece
                            15th European Conference on Software Maintenance and Reengineering 
(
CSMR 
2011)
Design Problems
 
non-compliance with
design principles
 
excessive
metric values
 
lack of
design patterns
 
violations
of 
design
heuristics
 
Fowler’s
bad smells
Design Problems … can be numerous
Motivation
 
A
r
e
 
a
l
l
 
i
d
e
n
t
i
f
i
e
d
 
d
e
s
i
g
n
 
p
r
o
b
l
e
m
s
 
w
o
r
r
y
i
n
g
?
 Example: Why would it be urgent to improve a method suffering
from Long Method if the method had never been changed?
 
Need to define (quantify) the urgency to resolve a problem
 One possible source of information: 
Past code versions
Underlying Philosophy: 
code fragments that have been subject to
maintenance tasks in the past, are more likely to undergo changes
 refactorings involving the corresponding code should have a
higher priority.
Goal
 
 
To rank refactoring suggestions based on the urgency to resolve
the corresponding design problems
 The ranking mechanism should take into account:
 the number of past changes
 the extent of change
 the proximity to the current version
Inspiration
Forecasting in Financial Markets vs. Software
F
i
n
a
n
c
i
a
l
 
M
a
r
k
e
t
s
 Trends in volatility are more predictable than trends in prices
 Volatility is related to 
risk
 and 
general stability 
of markets
 defined as the relative rate at which prices move up and down
 time: trading days
 
S
o
f
t
w
a
r
e
 
 
P
r
e
v
e
n
t
i
v
e
 
M
a
i
n
t
e
n
a
n
c
e
 Risk lies in the decision to invest on resolving design problems
volatility based on changes involving code affected by a smell
 time: successive software versions
Code Smell Volatility
Forecasting Models
 
Examined Smells
 
 Detection tool: 
JDeodorant
 
 
Identified smells:
 
Long Method
 Feature Envy
 State Checking
Long Method
int i;
int product = 1;
for(i = 0; i < N; ++i) {
  product = product *i;
}
System.out.println(product);
Pieces of code with large size, high complexity and low cohesion
 
int i;
int sum = 0;
for(i = 0; i < N; ++i) {
  sum = sum + i;
}
System.out.println(sum);
What to look for
 
The presence of 
Long Method
 implies that it might be difficult
to maintain the method
 
p
e
r
f
o
r
m
 
r
e
f
a
c
t
o
r
i
n
g
 
i
f
 
w
e
 
e
x
p
e
c
t
 
t
h
a
t
 
t
h
e
 
i
n
t
e
n
s
i
t
y
 
o
f
 
t
h
e
s
m
e
l
l
 
w
i
l
l
 
c
h
a
n
g
e
P
r
e
v
i
o
u
s
 
v
e
r
s
i
o
n
s
:
 
d
e
t
e
c
t
 
c
h
a
n
g
e
s
 
i
n
 
t
h
e
 
i
m
p
l
e
m
e
n
t
a
t
i
o
n
 
o
f
 
t
h
e
m
e
t
h
o
d
 
t
h
a
t
 
a
f
f
e
c
t
 
t
h
e
 
i
n
t
e
n
s
i
t
y
 
o
f
 
t
h
e
 
s
m
e
l
l
 
c
h
a
n
g
e
Long Method
int i;
int sum = 0;
int product = 1;
for(i = 0; i < N; ++i) {
  sum = sum + i;
  product = product *i;
}
System.out.println(sum);
System.out.println(product);
int i;
int sum = 0;
int product = 1;
int sumEven = 0;
for(i = 0; i < N; ++i) {
  sum = sum + i;
  product = product *i;
  if(i%2 == 0)
    sumEven += i;
}
System.out.println(sum);
System.out.println(product);
System.out.println(sumEven);
V
e
r
s
i
o
n
 
i
V
e
r
s
i
o
n
 
i
+
1
E
x
t
e
n
d
 
o
f
 
C
h
a
n
g
e
:
 
n
u
m
b
e
r
o
f
 
e
d
i
t
 
o
p
e
r
a
t
i
o
n
s
 
t
o
 
c
o
n
v
e
r
t
m
e
t
h
o
d
i
 
t
o
 
m
e
t
h
o
d
i
+
1
Feature Envy
A method is “
more interested in a class other than the one it
actually is in
 
m(Target t) {
    t.m1();
    t.m2();
    t.m3();
}
 
m() {
    m1();
    m2();
    m3();
}
Feature Envy
T
h
e
 
I
n
t
e
n
s
i
t
y
 
o
f
 
t
h
e
 
s
m
e
l
l
 
i
s
 
r
e
l
a
t
e
d
 
t
o
 
t
h
e
 
n
u
m
b
e
r
 
o
f
 
e
n
v
i
e
d
m
e
m
b
e
r
s
m(Target t) {
    t.m1();
    t.m2();
    t.m3();
}
E
x
t
e
n
d
 
o
f
 
C
h
a
n
g
e
:
 
v
a
r
i
a
t
i
o
n
 
i
n
 
t
h
e
 
n
u
m
b
e
r
 
o
f
 
e
n
v
i
e
d
 
m
e
m
b
e
r
s
V
e
r
s
i
o
n
 
i
V
e
r
s
i
o
n
 
i
+
1
envies 3
members
 
m(Target t) {
    t.m1();
    t.m2();
    t.m3();
    t.m4();
}
envies 4
members
State Checking
State Checking manifests itself as conditional statements that
select an execution path based on the state of an object
 
 
  
doStateA();
 
switch(
type
) {
case 
STATE_A
:
break;
case 
STATE_B
:
break;
}
 
 
  
doStateB();
What to look for
 
State Checking
: implies a missed opportunity for polymorphism
 
if (state == StateA) {
    . . .
    . . .
}
else if (state == StateB)
{
   . . .
   . . .
}
 
else if (state == StateC) {
   . . .
   . . .
}
 
+
State Checking
T
h
e
 
i
n
t
e
n
s
i
t
y
 
o
f
 
t
h
e
 
s
m
e
l
l
 
i
s
 
p
r
i
m
a
r
i
l
y
 
r
e
l
a
t
e
d
 
t
o
 
t
h
e
 
n
u
m
b
e
r
 
o
f
c
o
n
d
i
t
i
o
n
a
l
 
s
t
r
u
c
t
u
r
e
s
 
c
h
e
c
k
i
n
g
 
o
n
 
t
h
e
 
s
a
m
e
 
s
t
a
t
e
s
V
e
r
s
i
o
n
 
i
V
e
r
s
i
o
n
 
i
+
1
1 cond.
structure
2 cond.
structures
E
x
t
e
n
d
 
o
f
 
C
h
a
n
g
e
:
 
v
a
r
i
a
t
i
o
n
 
i
n
 
t
h
e
 
n
u
m
b
e
r
o
f
 
c
o
n
d
i
t
i
o
n
a
l
 
s
t
r
u
c
t
u
r
e
s
Application
1.
Calculate past volatility values (for each
refactoring opportunity)
2.
Estimate future volatility
3.
Rank suggestions according to this estimate
Evaluation
 
 
Goal: 
To compare the accuracy of the four examined models
 performed along two axes:
 direct comparison of forecast accuracy (RMSE)
 comparison of rankings produced by each model  and according
to the actual volatility
 
Context:
 two open source projects
 JMol: 26 project versions (2004 ..)
 JFreeChart: 15 project versions (2002 ..)
JMol
JFreeChart
Comparison of Forecast Accuracy
 
both consider the
average of all
historical values
Long Method /
JFreeChart
Comparison of Forecast Accuracy
 
R
a
n
d
o
m
 
W
a
l
k
 
i
s
b
e
i
n
g
 
f
a
v
o
r
e
d
 
b
y
s
u
c
c
e
s
s
i
v
e
 
v
e
r
s
i
o
n
s
w
i
t
h
 
z
e
r
o
 
v
o
l
a
t
i
l
i
t
y
 
Peaks in RMSE when
versions with zero
volatility are followed
by abrupt change
Feature Envy /
JMol
Comparison of Forecast Accuracy
Overall RMSE for each smell and forecasting model
 
 Simplicity and relatively good accuracy of HA 
               appropriate strategy for ranking refactoring suggestions
 
 HA achieves the lowest error for Long Method and Feature Envy
 
 more sophisticated models that take proximity into account do
not provide higher accuracy
Ranking Comparison
 
 
Forecasting models extract the anticipated smell volatility for
future software evolution
 Therefore, estimated volatility for the last transition can be
employed as ranking criterion for refactoring suggestions
 
Evaluation
:
 
C
o
m
p
a
r
e
Ranking Comparison
 
 
To compare the similarity between alternative rankings (of the
same set) we used Spearman’s footrule distance
 
N
F
r
 
=
 
0
 
N
F
r
 
=
 
1
 
N
F
r
 
=
 
0
.
3
3
3
Ranking Comparison - Spearman’s footrule
(Long Method / JFreeChart)
(Feature Envy / JMol)
(State Checking / JMol)
high frequency
of changes
low frequency
of changes
low frequency
of changes
Conclusions
 
Refactoring suggestions can be ranked:
 according to design criteria
 according to past source code changes (higher priority for
pieces of code that have been the subject of maintenance)
Simple forecasting models, such as Historical Average
 lowest RMSE error
 similar rankings to those obtained by actual volatility (frequent
changes)
Future Work #1: Analyze history at a more fine-grained level
Future Work #2: Combination of structural and historical criteria
Thank you for your attention
                            15th European Conference on Software Maintenance and Reengineering 
(
CSMR 
2011)
Slide Note
Embed
Share

Design problems in software development can be identified based on non-compliance with design principles, excessive metric values, violations of design heuristics, and lack of design patterns. By assessing the urgency to resolve these problems using past code versions, a ranking mechanism can prioritize refactoring suggestions for improved code maintenance. Drawing inspiration from financial markets, where volatility impacts risk and stability, software preventive maintenance can mitigate risks associated with design issues.


Uploaded on Sep 28, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Ranking Refactoring Suggestions based on Historical Volatility Nikolaos Tsantalis Alexander Chatzigeorgiou University of Macedonia Thessaloniki, Greece 15th European Conference on Software Maintenance and Reengineering (CSMR 2011)

  2. Design Problems non-compliance with design principles excessive metric values violations of design heuristics lack of design patterns Fowler s bad smells

  3. Design Problems can be numerous JFlex 90 80 70 Number of Smells 60 50 Long Method 40 30 Feature Envy 20 State Checking 10 0 1.3 1.3.1 1.3.2 1.3.3 1.3.4 1.3.5 1.4 1.4.1 1.4.2 1.4.3 Versions JFreeChart 400 350 Number of Smells 300 250 200 Long Method 150 100 Feature Envy 50 State Checking 0 Versions

  4. Motivation Are all identified design problems worrying? Example: Why would it be urgent to improve a method suffering from Long Method if the method had never been changed? Need to define (quantify) the urgency to resolve a problem One possible source of information: Past code versions Underlying Philosophy: code fragments that have been subject to maintenance tasks in the past, are more likely to undergo changes refactorings involving the corresponding code should have a higher priority.

  5. Goal To rank refactoring suggestions based on the urgency to resolve the corresponding design problems The ranking mechanism should take into account: the number of past changes the extent of change the proximity to the current version

  6. Inspiration

  7. Forecasting in Financial Markets vs. Software Financial Markets Trends in volatility are more predictable than trends in prices Volatility is related to risk and general stability of markets defined as the relative rate at which prices move up and down time: trading days Software Preventive Maintenance Risk lies in the decision to invest on resolving design problems volatility based on changes involving code affected by a smell time: successive software versions

  8. Code Smell Volatility volatilityi+1 extent of changei-1,i extent of changei,i+1 transitioni transitioni+1 i-1 i i+1 software versions

  9. Forecasting Models Random Walk (RW) = +1 t t Historical Average (HA) += t 1 t 1 = i i t 1 Exponential Smoothing (ES) = ) + a 1 ( + 1 t t t Exponentially-Weighted Moving Average + = t 1 ) 1 ( t 1 = i + a + 1 t t j t 1

  10. Examined Smells Detection tool: JDeodorant Identified smells: Long Method Feature Envy State Checking

  11. Long Method Pieces of code with large size, high complexity and low cohesion int i; int i; int sum = 0; int product = 1; for(i = 0; i < N; ++i) { for(i = 0; i < N; ++i) { sum = sum + i; product = product *i; } } System.out.println(sum); System.out.println(product);

  12. What to look for The presence of Long Method implies that it might be difficult to maintain the method perform refactoring if we expect that the intensity of the smell will change Previous versions: detect changes in the implementation of the method that affect the intensity of the smell change

  13. Long Method Version i Version i+1 int i; int sum = 0; int product = 1; int sumEven = 0; for(i = 0; i < N; ++i) { sum = sum + i; product = product *i; if(i%2 == 0) sumEven += i; } System.out.println(sum); System.out.println(product); System.out.println(sumEven); int i; int sum = 0; int product = 1; for(i = 0; i < N; ++i) { sum = sum + i; product = product *i; } System.out.println(sum); System.out.println(product); Extend of Change: number of edit operations to convert methodito methodi+1

  14. Feature Envy A method is more interested in a class other than the one it actually is in Target Source m1() m2() m3() m() { m(Target t) { t.m1(); t.m2(); t.m3(); } m1(); m2(); m3(); }

  15. Feature Envy The Intensity of the smell is related to the number of envied members Version i Version i+1 Source Source m(Target t) { t.m1(); t.m2(); t.m3(); } m(Target t) { t.m1(); t.m2(); t.m3(); t.m4(); } envies 3 members envies 4 members Extend of Change: variation in the number of envied members

  16. State Checking State Checking manifests itself as conditional statements that select an execution path based on the state of an object Context Context Type type - type : int - STATE_A : int = 1 - STATE_B : int = 2 - STATE_B : int = 2 - type : int - STATE_A : int = 1 +method() + method() { switch(type) { case STATE_A: } + method() { type.method(); doStateA(); break; case STATE_B: doStateB(); StateB StateA +method() { +method() { break; } } } }

  17. What to look for State Checking: implies a missed opportunity for polymorphism State if (state == StateA) { . . . . . . } else if (state == StateB) { . . . . . . } else if (state == StateC) { . . . . . . } StateC . . . . . . . . . StateA StateB . . . . . . . . . (additional statements) + +

  18. State Checking The intensity of the smell is primarily related to the number of conditional structures checking on the same states Version i Version i+1 ClassZ ClassX ClassY - type : int - STATE_A : int = 1 - STATE_B : int = 2 - type : int - STATE_B : int = 2 - STATE_C : int = 3 - type : int - STATE_A : int = 1 - STATE_B : int = 2 - STATE_C : int = 3 + method() { switch(type) { case STATE_A: doStateA(); break; case STATE_B: doStateB(); break; } } + method() { switch(type) { case STATE_B: doStateB(); break; case STATE_C: doStateC(); break; } } + method() { switch(type) { case STATE_A: doStateA(); break; case STATE_B: doStateB(); break; case STATE_C: doStateC(); break; } } 1 cond. structure 2 cond. structures Extend of Change: variation in the number of conditional structures

  19. Application 1. Calculate past volatility values (for each refactoring opportunity) Estimate future volatility Rank suggestions according to this estimate 2. 3.

  20. Evaluation Goal: To compare the accuracy of the four examined models performed along two axes: direct comparison of forecast accuracy (RMSE) comparison of rankings produced by each model and according to the actual volatility Context: two open source projects JMol: 26 project versions (2004 ..) JFreeChart: 15 project versions (2002 ..)

  21. JMol #classes KSLOC 600 200 180 500 160 140 400 #classes 120 KSLOC 300 100 80 200 60 40 100 20 0 0 State Checking Feature Envy 0.14 0.0045 0.004 0.12 Extent of Change (State Checking) Extent of Change (Feature Envy) 0.0035 0.1 0.003 0.08 0.0025 0.002 0.06 0.0015 0.04 0.001 0.02 0.0005 0 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 Transitions between software versions

  22. JFreeChart #classes KSLOC 700 200 180 600 160 500 140 #classes 120 KSLOC 400 100 300 80 60 200 40 100 20 0 0 Long Method 0.06 0.05 Extent of Change 0.04 0.03 0.02 0.01 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Transitions between software versions

  23. Comparison of Forecast Accuracy Long Method / JFreeChart N 1 ( ) = i 2 = RMSE i i N 1 0.07 0.06 0.05 0.04 both consider the average of all historical values RMSE EWMA ES 0.03 HA RW 0.02 0.01 0 1 2 3 4 5 6 7 8 9 10 11 12 Transitions between software versions

  24. Comparison of Forecast Accuracy Feature Envy / JMol N 1 ( ) = i 2 = RMSE i i N Peaks in RMSE when versions with zero volatility are followed by abrupt change 1 0.005 0.004 0.003 RMSE EWMA ES Random Walk is being favored by successive versions with zero volatility HA 0.002 RW 0.001 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 Transitions between software versions

  25. Comparison of Forecast Accuracy Overall RMSE for each smell and forecasting model Random Walk Historical Average Exponential Smoothing EWMA Long Method (JFreeChart) Feature Envy (JMol) State Checking (JMol) 0.032646 0.031972 0.032176 0.032608 0.003311 0.003295 0.003309 0.003301 0.052842 0.052967 0.053051 0.053879 HA achieves the lowest error for Long Method and Feature Envy more sophisticated models that take proximity into account do not provide higher accuracy Simplicity and relatively good accuracy of HA appropriate strategy for ranking refactoring suggestions

  26. Ranking Comparison Forecasting models extract the anticipated smell volatility for future software evolution Therefore, estimated volatility for the last transition can be employed as ranking criterion for refactoring suggestions Evaluation: Compare Rankings produced by each model Rankings produced by actual volatility in the last transition

  27. Ranking Comparison To compare the similarity between alternative rankings (of the same set) we used Spearman s footrule distance S ( ) S = , ( ) ( ) Fr i i 1 2 1 2 = 1 i A B C D E F A B C D E F A B C D E F F E D C B A A B C D E F A C B E F D NFr = 0 NFr = 1 NFr = 0.333

  28. Ranking Comparison - Spearmans footrule high frequency of changes (Long Method / JFreeChart) Random Walk Average 0.6220 0.3255 Historical Exponential Smoothing 0.5334 EWMA Actual 0.3238 low frequency of changes (Feature Envy / JMol) Random Walk 0.0096 Historical Average 0.0210 Exponential Smoothing 0.0199 EWMA Actual 0.0213 (State Checking / JMol) low frequency of changes Random Walk 0.07 Historical Average 0.13 Exponential Smoothing 0.14 EWMA Actual 0.13

  29. Conclusions Refactoring suggestions can be ranked: according to design criteria according to past source code changes (higher priority for pieces of code that have been the subject of maintenance) Simple forecasting models, such as Historical Average lowest RMSE error similar rankings to those obtained by actual volatility (frequent changes) Future Work #1: Analyze history at a more fine-grained level Future Work #2: Combination of structural and historical criteria

  30. Thank you for your attention 15th European Conference on Software Maintenance and Reengineering (CSMR 2011)

Related


More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#