Finite Automata and Regular Functions in Computer Science

Regular Functions
Rajeev Alur
  
University of Pennsylvania
Languages vs Functions
A language L is a subset of 
*
A numerical function maps strings in 
* to N (or integers Z)
A string-to-string transformation maps 
* to 
*
For Turing-complete models of computation, choice is not critical
“Finite-state” Computation
For language-based view, the definable class is 
regular languages
Many alternative characterizations
Appealing theoretical properties
Finite Automata: Intuitive operational model with efficient
analysis algorithms
Many applications
What is the analog of regularity for defining functions?
Finite Automata with Cost Labels
C: Buy Coffee
S: Fill out a survey
M: End-of-month
C / 2
C / 1
S
M
M
Maps a string over {C,S,M} to a cost value:
 
Cost of a coffee is 2, but reduces to 1 after filling out a
 
survey until the end of the month
Output is computed by implicitly adding up transition costs
Intuitive, analyzable, and many applications
But expressiveness not theoretically robust
S
Finite Automata with Cost Registers
C / x:=x+2
C / x:=x+1
S
M
M
Cost Register Automata:
 
Finite control + Finite number of registers
 
Registers updated explicitly on transitions
 
Registers are 
write-only
 (no tests allowed)
 
Each (final) state associated with output register
x
x:=0
x
S
CRA Example
C / x:=x+2
C / x:=x+1
S
M / x:=0
M / x:=0
At any time, x = costs of coffees 
during the current month
Cost register x reset to 0 at each end-of-month
x
x:=0
x
S
CRA Example
C / x:=x+2
C / x:=x+1
S / x:=y
M / y:=x 
M / y:=x
Filling out a survey gives discounted cost for all the coffees
during that month
x
x,y:=0
x
y:=y+1
S
CRA Example
C / y:=y+1
M / x:=min(x,y); y:=0
Output equals the minimum number of coffees consumed during
a month
 
Updates use two operations: increment and min
min(x,y)
y:=0
x:=Infty
Talk Outline
 Definition of Regular Functions
 Additive Regular Functions
 String Transducers
 Regular Functions over a Semiring
 Conclusions + Open Problems
Cost Model
Cost Grammar G
: Defines a set of terms
 
Inc:  t := c | (t+c)
 
Plus:
 
t := c | (t+t)
 
Min-Inc: t := c | (t+c) | min(t,t)
 
Inc-Scale: t := c | (t+c) | (t*d)
Interpretation []
:
 
Set D of cost values
 
Mapping operators to functions over D
 
Example interpretations for the Plus grammar:
  
Set N of natural numbers with addition
  
Set 
* of strings with concatenation
Regular Cost Function
Definition parameterized by the cost model C=(D,G,[])
A (partial) function f:
*->D is regular w.r.t. the cost model C if
there exists a string-to-tree transformation g such that
 
(1) for all strings w, f(w)=[g(w)]
 
(2) g is a regular string-to-tree transformation
Example Regular Cost Function
Cost grammar Min-Inc: t := c | (t+c) | min(t,t)
Interpretation: Natural numbers with usual meaning of + and min
={C,M}
f(w) = Minimum number of C symbols between successive M’s
Input w=   C C M C C C M
Tree:
Value = 2
Regular String-to-tree Transformations
Definition based on MSO (Monadic Second Order Logic) –
definable graph-to-graph transformations (Courcelle)
Studied in context of syntax-directed program transformations,
attribute grammars, and XML transformations
Operational model: Macro Tree Transducers (Engelfriet et al)
Recent proposals:
  
Streaming String Transducers (POPL 2011)
  
Streaming Tree Transducers (ICALP 2012)
Properties of Regular Cost Functions
Known properties of regular string-to-tree transformations imply:
If f and g are regular w.r.t. a cost model C, and L is a regular
language, then “if L then f else g” is regular w.r.t. C
Reversal: define Rev(f)(w) = f(reverse(w)).
 
If f is regular w.r.t. a cost model C, then so is Rev(f)
Costs grow linearly with the size of the input string:
 
Term corresponding to a string w is O(|w|)
Talk Outline
 Additive Regular Functions
 String Transducers
 Regular Functions over a Semiring
 Conclusions + Open Problems
Regular Cost Functions over Commutative Monoid
Cost model: D with binary function +
Interpretation for + is commutative, associative, with identity 0
Cost grammar G(+): t := c | (t+t)
Cost grammar G(+c): t := c | (t+c)
Thm: Regularity w.r.t. G(+) coincides with regularity w.r.t. G(+c)
Proof intuition: Show that rewriting terms such as (2+3)+(1+5) to
(((2+3)+1)+5) is a regular tree-to-tree transformation, and use
closure properties of tree transducers
Additive Cost Register Automata
Additive Cost Register Automata:
 
DFA + Finite number of registers
 
Each register is initially 0
 
Registers updated using assignments x := y + c
 
Each final state labeled with output term x + c
Given commutative monoid (D,+,0), an ACRA defines a partial
function from 
* to D
 
C / x:=x+2, y:=y+1
C / x:=x+1
S / x:=y
M / y:=x 
M / y:=x
x
x,y:=0
x
S
Regular Cost Functions and ACRAs
Thm: Given a commutative monoid (D,+,0), a function f:S*->D is
definable using an ACRA iff it is regular w.r.t. grammar G(+).
Establishes ACRA as an intuitive, deterministic operational
model to define this class of regular functions
Proof relies on the model of SSTT (Streaming string-to-tree
transducers) that can define all regular string-to-tree
transformations
Single-Valued Weighted Automata
Weighted Automata:
 
Nondeterministic automata with edges labeled with costs
Single-valued:
 
Each string has at most one accepting path
Cost of a string:
 
Sum of costs of transitions along the accepting path
Example: When you fill out a survey, each coffee during that
month gets the discounted cost.
 
Locally nondeterministic, but globally single-valued
Thm: ACRAs and single-valued weighted automata define the
same class of functions
Decision Problems for ACRAs
Min-Cost: Given an ACRA M, find min {M(w) | w in 
*}
 
Solvable in Polynomial-time
 
Shortest path in a graph with vertices (state, register)
Equivalence: Do two ACRAs define the same function
 
Solvable in Polynomial-time
 
Based on propagation of linear equalities in program graphs
Register Minimization: Given an ACRA M with k registers, is
there an equivalent ACRA with < k registers?
 
Algorithm polynomial in states, and exponential in k
Towards a Theory of Additive Regular Functions
Goal: Machine-independent characterization of regularity
 
Similar to Myhill-Nerode theorem for regular languages
 
Registers should compute necessary auxiliary functions
Example: 
 = {C,S}
 
f(w)= if w contains S then |w| else 2|w|
 
f
1
(C
i
)=i and f
2
(C
i
)=2i are necessary and sufficient
Thm: Register complexity of a function is at least k iff there
exist strings 
0
, … 
m
, loop-strings 
1
,…
m
, and suffixes w
1
,…w
m
,
and k distinct vectors 
c
1
,…c
k
 
such that for all numbers x
1
,…x
m
,
f(
0
 
1
x1
 
1
 
2
x2
m
 w
i
) = 
j
 c
ij
 x
j
 + d
i
Talk Outline
 
String Transducers
 Regular Functions over a Semiring
 Conclusions + Open Problems
Regular Functions for Non-Commutative Monoid
Cost model: 
* with binary function concatenation .
Interpretation for . is non-commutative, associative, identity 
Cost grammar G(.): t := 
 | (t . t)
  
 is a string
Cost grammar G(.
): t := 
 | (t . 
) | (
 . t)
Thm: Regular functions w.r.t G(.) is a strict superset of regular
functions w.r.t. G(.
)
Classical model of Sequential Transducers captures only a subset
of regular functions w.r.t. G(.
)
Streaming String Transducer: Delete
Finite state control + register x ranging over output strings
 
String variables explicitly updated at each step
Delete all a symbols
output x
a / x := x
x := 
b / x := xb
Streaming String Transducer: Reverse
Symbols may be added to string variables at both ends
output x
a / x := ax
x := 
b / x := bx
Streaming String Transducer: Regular Look Ahead
If input ends with b, then delete all a symbols, else reverse
output x
a / x:=ax
x,y := 
output y
b / x:=bx; y:=yb
b / x:=bx; y:=yb
a / x:=ax
Register x equals reverse of the input so far
Register  y equals input so far with all a’s deleted
Streaming String Transducer: Concatenation
Registers can be concatenated
Example: Swap substring before first a with substring
following last a
   
a
  
a
  
   
a
 
 
 
a
 
Key restriction: a variable can appear at most once on RHS
        [x,y] := [xy,  
] allowed
  
 
    [x,y] := [xy, y] not allowed
SST Properties
At each step, one input symbol is processed, and at most a
constant number of output symbols are newly created
Output is bounded: Length of output = O(length of input)
SST transduction can be computed in linear time
Finite-state control: Registers not examined
SST cannot implement merge
 
        
f(u
1
u
2
….u
k
#v
1
v
2
…v
k
) = u
1
v
1
u
2
v
2
….u
k
v
k
Multiple registers are essential
 
       For f(w)=w
k
, k variables are necessary and sufficient
Decision Problem: Type Checking
Pre/Post condition assertion: { L }  S  { L’ }
 
Given a regular language L of input strings (pre-condition), an
SST S, and a regular language L’ of output strings (post-
condition), verify that for every w in L, S(w) is in L’
Thm: Type checking is solvable in polynomial-time
 
Key construction: Summarization
 
Decision Problem: Equivalence
Functional Equivalence;
 
Given SSTs S and S’ over same input/output alphabets,
 
check whether they define the same transductions.
Thm: Equivalence is solvable in PSPACE
 
(polynomial in states, but exponential in # of string variables)
 
No lower bound known
 
Expressiveness
Thm: A string transduction is definable by an SST iff it is regular
 
    1. SST definable transduction is MSO definable
    2. MSO definable transduction can be captured by a two-way
            transducer (Engelfriet/Hoogeboom 2001)
    3. SST can simulate a two-way transducer
Evidence of robustness of class of regular transductions
Closure properties
 
1. Sequential composition: f
1
(f
2
(w))
    2. Regular conditional choice: if w in L then f
1
(w) else f
2
(w)
 
SST Applications
Equivalent class of single pass list processing programs with
solvable program analysis problems (POPL 2011)
Algorithmic verification of retransmission protocols (network
components as regular transformers over bit sequences;
FORTE 2013)
Opportunities
BEK: Transducer-based tool for analyzing string sanitizers
FlashFill: Learning string transformations from examples
function delete
  input ref curr;
  input data v;
  output ref result;
  output bool flag := 0;
  local ref prev;
  while (curr != nil) & (curr.data = v) {
      curr := curr.next;
      flag := 1;
      }
  result := curr;
  prev:= curr;
  if (curr != nil) then {
     curr := curr.next;
     prev.next := nil;
     while (curr != nil) {
         if (curr.data = v) then {
             curr := curr.next;
             flag := 1;
             }
         else {
             prev.next := curr;
             prev := curr;
             curr := curr.next;
             prev.next := nil;
             }
  }
Decidable Analysis:
  1. Assertion checks
  2. Pre/post condition
  3. Full functional correctness
Algorithmic Verification of List-processing Programs
head
tail
3
8
2
Talk Outline
 
Regular Functions over a Semiring
 Conclusions + Open Problems
Regular Cost Functions over Semiring
Cost Domain: Natural numbers + Infty
Operation Min: Commutative monoid with identity Infty
Operation +: Monoid with identity 0
Rules: a + Infty = Infty + a = Infty
 
a+min(b,c) = min (a+b, a+c); min(b,c)+a = min(b+a,c+a)
Cost grammar MinInc: t := c | min(t,t) | (t+c)
Goal: Understand class of regular functions w.r.t. MinInc
Weighted Automata
Weighted Automata:
 
Nondeterministic automata with edges labeled with costs
Interpreted over the semiring cost model:
 
cost of string w = min of costs of all accepting paths over w
 
cost of a path = sum of costs of all edges in a path
Widely studied (Weighted Automata, Droste et al)
 
Minimum cost problem solvable
 
Equivalence undecidable over (N, min, +)
 
Not determinizable
 
Natural model in many applications
 
Recent interest in CAV community for quantitative analysis
CRA over Min-Inc Semiring
C / y:=y+1
M / x:=min(x,y); y:=0
Output equals the minimum number of coffees consumed during
a month
min(x,y)
y:=0
x:=Infty
CRA(min,+c) = Weighted Automata
From WA to CRA(min,+c):
 
Generalizes subset construction for determinization
 
For every state q of WA, CRA maintains a register x
q
 
x
q
 = min of costs of all paths to q on input read so far
 
Update on a: x
q
 := min { x
p
 + c | p –(a,c)-> q is edge in WA
}
From CRA(min,+c) to WA:
 
State of WA = (state q of CRA, register x)
 
min simulated by nondeterminism
 
To simulate p – (a, x:=min(y,z)) -> q  in CRA,
  
add a-labeled edges from (p,y) and (p,z) to (q,x)
 
Distributivity of + over min critical
CRA(min,+c) > Min-Plus Regular Functions
Thm: The class of regular functions w.r.t. Min-Inc semiring is a
strict subset of weighted automata
Above function is not regular: cost term is quadratic in input
a/1
b/1
#
b,#
a,#
Input w: w
1
 # w
2
 # … # w
n
Each w
i
 in {a,b}*
a
i
 = Number of a’s in w
i
b
i
 = Number of b’s in w
i
Cost(w) = min
j
 { a
1
+…+a
j
+b
j+1
+…+b
n
}
Machine Model for Semiring Regular Functions
Updates to registers must be copyless
 
Each register appears at most once in a right-hand-side
 
Update [x,y] := [min(x,y),y] not allowed
 
Necessary to maintain “linear” growth
Need ability to simulate substitution
 
Register x carries two values c and d
 
Stands for the parameterized expression min(c, ?)+d
 
Besides min and inc, can substitute ? with a value
Resulting model coincides with regular functions over semiring
Open: Decidability of equivalence over (N, min , +c)
Talk Outline
 
Conclusions + Open Problems
Discounted Cost Regular Functions
Basic element: (cost c, discount d)
Discounted sum: (c
1
,d
1
)*(c
2
,d
2
) = (c
1
+d
1
c
2
, d
1
d
2
)
Example of non-commutative monoid
Classical Model: Future discounting
 
Cost of a path: (c
1
,d
1
) * (c
2
,d
2
) * … * (c
n
,d
n
)
 
Polynomial-time algorithm for “generalized” shortest path
Past discounting
 
Cost of a path: (c
n
,d
n
) * (c
n-1
,d
n-1
) * … * (c
1
,d
1
)
 
Same PTIME algorithm works for shortest paths
Prioritized double discounting
 
Cost = (c
1
,d
1
) * … * (c
n
, d
n
) * (c’
1
,d’
1
) * … * (c’
n
,d’
n
)
 
Shortest path: NExpTime algorithm
Open: Shortest path for Discounted Cost Register Automata
Open Problems and Challenges
Complexity of equivalence of SSTs and STTs
Large gap between lower and upper bounds
Machine-independent characterization of regularity
Support functions needed to compute a function
Decidability of min-cost for discounted cost automata
Decidability of equivalence for Copyless CRAs over (N,min,+c
)
Simpler/cleaner proofs of equivalence of machine models and
MSO-definable transformations
Unexplored Directions
Probabilistic models
 
Markov chains / MDPs with regular rewards
Regular costs for infinite executions
 
Infinitary operators: Lim-average, Discounted-sum
 
Starting point: Infinite-String-to-Tree Transducers
Regular costs for trees
Combinations of other operations
 
Regular functions over G(+,min): t := c | (t+t) | min(t,t)
Conclusions
Cost Register Automata
Write-only machines with multiple registers to store outputs
 
Regular Functions
Definition parameterized by allowed operations
Based on MSO-definable graph transformations / transducers
Emerging theory
Some results, new connections
Many open problems and unexplored directions
Acknowledgements and References
Streaming String Transducers
  
(with P. Cerny; POPL’11, FSTTCS’10)
Transducers over Infinite Strings
  
(with E. Filiot, A. Trivedi; LICS’12)
Streaming Tree Transducers
  
(with L. D’Antoni; ICALP’12)
Regular Functions and Cost Register Automata
 
(with L. D’Antoni, J. Deshmukh, M. Raghothaman, Y. Yuan; LICS’13)
Decision problems for Additive Cost Regular Functions
 
(with M. Raghothaman; ICALP’13)
Infinite-String to Infinite-Term Regular Transformations
 
(with A. Durand, A. Trivedi; LICS’13)
Min-cost problems for Discounted Sum Regular Functions
 
(with S. Kannan, K. Tian, Y. Yuan; LATA’13)
Slide Note
Embed
Share

Exploring the concepts of regular functions, languages vs functions, finite-state computation, finite automata with cost labels, finite automata with cost registers, and examples of Cost Register Automata. These topics delve into the theoretical and practical aspects of defining functions and computational models in computer science. The content touches upon intuitive operational models, efficient analysis algorithms, control structures, and output computations associated with various automata types.

  • Automata
  • Regular Functions
  • Finite-state Computation
  • Computer Science
  • Cost Registers

Uploaded on Sep 12, 2024 | 1 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Regular Functions Rajeev Alur University of Pennsylvania

  2. Languages vs Functions A language L is a subset of * A numerical function maps strings in * to N (or integers Z) A string-to-string transformation maps * to * For Turing-complete models of computation, choice is not critical

  3. Finite-state Computation For language-based view, the definable class is regular languages Many alternative characterizations Appealing theoretical properties Finite Automata: Intuitive operational model with efficient analysis algorithms Many applications What is the analog of regularity for defining functions?

  4. Finite Automata with Cost Labels C / 2 C / 1 C: Buy Coffee S: Fill out a survey M: End-of-month S M S M Maps a string over {C,S,M} to a cost value: Cost of a coffee is 2, but reduces to 1 after filling out a survey until the end of the month Output is computed by implicitly adding up transition costs Intuitive, analyzable, and many applications But expressiveness not theoretically robust

  5. Finite Automata with Cost Registers C / x:=x+2 C / x:=x+1 S x:=0 x x M M S Cost Register Automata: Finite control + Finite number of registers Registers updated explicitly on transitions Registers are write-only (no tests allowed) Each (final) state associated with output register

  6. CRA Example C / x:=x+2 C / x:=x+1 S x:=0 x x M / x:=0 M / x:=0 S At any time, x = costs of coffees during the current month Cost register x reset to 0 at each end-of-month

  7. CRA Example C / x:=x+2 y:=y+1 C / x:=x+1 S / x:=y x,y:=0 x x M / y:=x M / y:=x S Filling out a survey gives discounted cost for all the coffees during that month

  8. CRA Example C / y:=y+1 x:=Infty min(x,y) y:=0 M / x:=min(x,y); y:=0 Output equals the minimum number of coffees consumed during a month Updates use two operations: increment and min

  9. Talk Outline Definition of Regular Functions Additive Regular Functions String Transducers Regular Functions over a Semiring Conclusions + Open Problems

  10. Cost Model Cost Grammar G: Defines a set of terms Inc: t := c | (t+c) Plus: t := c | (t+t) Min-Inc: t := c | (t+c) | min(t,t) Inc-Scale: t := c | (t+c) | (t*d) Interpretation []: Set D of cost values Mapping operators to functions over D Example interpretations for the Plus grammar: Set N of natural numbers with addition Set * of strings with concatenation

  11. Regular Cost Function Definition parameterized by the cost model C=(D,G,[]) A (partial) function f: *->D is regular w.r.t. the cost model C if there exists a string-to-tree transformation g such that (1) for all strings w, f(w)=[g(w)] (2) g is a regular string-to-tree transformation

  12. Example Regular Cost Function Cost grammar Min-Inc: t := c | (t+c) | min(t,t) Interpretation: Natural numbers with usual meaning of + and min ={C,M} f(w) = Minimum number of C symbols between successive M s Input w= C C M C C C M min min Tree: + + + + + Value = 2 0 1 1 Infty 1 0 1 1

  13. Regular String-to-tree Transformations Definition based on MSO (Monadic Second Order Logic) definable graph-to-graph transformations (Courcelle) Studied in context of syntax-directed program transformations, attribute grammars, and XML transformations Operational model: Macro Tree Transducers (Engelfriet et al) Recent proposals: Streaming String Transducers (POPL 2011) Streaming Tree Transducers (ICALP 2012)

  14. Properties of Regular Cost Functions Known properties of regular string-to-tree transformations imply: If f and g are regular w.r.t. a cost model C, and L is a regular language, then if L then f else g is regular w.r.t. C Reversal: define Rev(f)(w) = f(reverse(w)). If f is regular w.r.t. a cost model C, then so is Rev(f) Costs grow linearly with the size of the input string: Term corresponding to a string w is O(|w|)

  15. Talk Outline Additive Regular Functions String Transducers Regular Functions over a Semiring Conclusions + Open Problems

  16. Regular Cost Functions over Commutative Monoid Cost model: D with binary function + Interpretation for + is commutative, associative, with identity 0 Cost grammar G(+): t := c | (t+t) Cost grammar G(+c): t := c | (t+c) Thm: Regularity w.r.t. G(+) coincides with regularity w.r.t. G(+c) Proof intuition: Show that rewriting terms such as (2+3)+(1+5) to (((2+3)+1)+5) is a regular tree-to-tree transformation, and use closure properties of tree transducers

  17. Additive Cost Register Automata C / x:=x+2, y:=y+1 x,y:=0 C / x:=x+1 S / x:=y x x M / y:=x S M / y:=x Additive Cost Register Automata: DFA + Finite number of registers Each register is initially 0 Registers updated using assignments x := y + c Each final state labeled with output term x + c Given commutative monoid (D,+,0), an ACRA defines a partial function from * to D

  18. Regular Cost Functions and ACRAs Thm: Given a commutative monoid (D,+,0), a function f:S*->D is definable using an ACRA iff it is regular w.r.t. grammar G(+). Establishes ACRA as an intuitive, deterministic operational model to define this class of regular functions Proof relies on the model of SSTT (Streaming string-to-tree transducers) that can define all regular string-to-tree transformations

  19. Single-Valued Weighted Automata Weighted Automata: Nondeterministic automata with edges labeled with costs Single-valued: Each string has at most one accepting path Cost of a string: Sum of costs of transitions along the accepting path Example: When you fill out a survey, each coffee during that month gets the discounted cost. Locally nondeterministic, but globally single-valued Thm: ACRAs and single-valued weighted automata define the same class of functions

  20. Decision Problems for ACRAs Min-Cost: Given an ACRA M, find min {M(w) | w in *} Solvable in Polynomial-time Shortest path in a graph with vertices (state, register) Equivalence: Do two ACRAs define the same function Solvable in Polynomial-time Based on propagation of linear equalities in program graphs Register Minimization: Given an ACRA M with k registers, is there an equivalent ACRA with < k registers? Algorithm polynomial in states, and exponential in k

  21. Towards a Theory of Additive Regular Functions Goal: Machine-independent characterization of regularity Similar to Myhill-Nerode theorem for regular languages Registers should compute necessary auxiliary functions Example: = {C,S} f(w)= if w contains S then |w| else 2|w| f1(Ci)=i and f2(Ci)=2i are necessary and sufficient Thm: Register complexity of a function is at least k iff there exist strings 0, m, loop-strings 1, m, and suffixes w1, wm, and k distinct vectors c1, cksuch that for all numbers x1, xm, f( 0 1x1 1 2x2 m wi) = j cij xj + di

  22. Talk Outline String Transducers Regular Functions over a Semiring Conclusions + Open Problems

  23. Regular Functions for Non-Commutative Monoid Cost model: * with binary function concatenation . Interpretation for . is non-commutative, associative, identity Cost grammar G(.): t := | (t . t) is a string Cost grammar G(. ): t := | (t . ) | ( . t) Thm: Regular functions w.r.t G(.) is a strict superset of regular functions w.r.t. G(. ) Classical model of Sequential Transducers captures only a subset of regular functions w.r.t. G(. )

  24. Streaming String Transducer: Delete Finite state control + register x ranging over output strings String variables explicitly updated at each step Delete all a symbols a / x := x x := output x b / x := xb

  25. Streaming String Transducer: Reverse Symbols may be added to string variables at both ends a / x := ax x := output x b / x := bx

  26. Streaming String Transducer: Regular Look Ahead If input ends with b, then delete all a symbols, else reverse a / x:=ax b / x:=bx; y:=yb b / x:=bx; y:=yb x,y := output x output y a / x:=ax Register x equals reverse of the input so far Register y equals input so far with all a s deleted

  27. Streaming String Transducer: Concatenation Registers can be concatenated Example: Swap substring before first a with substring following last a a a a a Key restriction: a variable can appear at most once on RHS [x,y] := [xy, ] allowed [x,y] := [xy, y] not allowed

  28. SST Properties At each step, one input symbol is processed, and at most a constant number of output symbols are newly created Output is bounded: Length of output = O(length of input) SST transduction can be computed in linear time Finite-state control: Registers not examined SST cannot implement merge f(u1u2 .uk#v1v2 vk) = u1v1u2v2 .ukvk Multiple registers are essential For f(w)=wk, k variables are necessary and sufficient

  29. Decision Problem: Type Checking Pre/Post condition assertion: { L } S { L } Given a regular language L of input strings (pre-condition), an SST S, and a regular language L of output strings (post- condition), verify that for every w in L, S(w) is in L Thm: Type checking is solvable in polynomial-time Key construction: Summarization

  30. Decision Problem: Equivalence Functional Equivalence; Given SSTs S and S over same input/output alphabets, check whether they define the same transductions. Thm: Equivalence is solvable in PSPACE (polynomial in states, but exponential in # of string variables) No lower bound known

  31. Expressiveness Thm: A string transduction is definable by an SST iff it is regular 1. SST definable transduction is MSO definable 2. MSO definable transduction can be captured by a two-way transducer (Engelfriet/Hoogeboom 2001) 3. SST can simulate a two-way transducer Evidence of robustness of class of regular transductions Closure properties 1. Sequential composition: f1(f2(w)) 2. Regular conditional choice: if w in L then f1(w) else f2(w)

  32. SST Applications Equivalent class of single pass list processing programs with solvable program analysis problems (POPL 2011) Algorithmic verification of retransmission protocols (network components as regular transformers over bit sequences; FORTE 2013) Opportunities BEK: Transducer-based tool for analyzing string sanitizers FlashFill: Learning string transformations from examples

  33. Algorithmic Verification of List-processing Programs function delete input ref curr; input data v; output ref result; output bool flag := 0; local ref prev; while (curr != nil) & (curr.data = v) { curr := curr.next; flag := 1; } result := curr; prev:= curr; if (curr != nil) then { curr := curr.next; prev.next := nil; while (curr != nil) { if (curr.data = v) then { curr := curr.next; flag := 1; } else { prev.next := curr; prev := curr; curr := curr.next; prev.next := nil; } } head tail 3 8 2 Decidable Analysis: 1. Assertion checks 2. Pre/post condition 3. Full functional correctness

  34. Talk Outline Regular Functions over a Semiring Conclusions + Open Problems

  35. Regular Cost Functions over Semiring Cost Domain: Natural numbers + Infty Operation Min: Commutative monoid with identity Infty Operation +: Monoid with identity 0 Rules: a + Infty = Infty + a = Infty a+min(b,c) = min (a+b, a+c); min(b,c)+a = min(b+a,c+a) Cost grammar MinInc: t := c | min(t,t) | (t+c) Goal: Understand class of regular functions w.r.t. MinInc

  36. Weighted Automata Weighted Automata: Nondeterministic automata with edges labeled with costs Interpreted over the semiring cost model: cost of string w = min of costs of all accepting paths over w cost of a path = sum of costs of all edges in a path Widely studied (Weighted Automata, Droste et al) Minimum cost problem solvable Equivalence undecidable over (N, min, +) Not determinizable Natural model in many applications Recent interest in CAV community for quantitative analysis

  37. CRA over Min-Inc Semiring C / y:=y+1 x:=Infty min(x,y) y:=0 M / x:=min(x,y); y:=0 Output equals the minimum number of coffees consumed during a month

  38. CRA(min,+c) = Weighted Automata From WA to CRA(min,+c): Generalizes subset construction for determinization For every state q of WA, CRA maintains a register xq xq = min of costs of all paths to q on input read so far Update on a: xq := min { xp + c | p (a,c)-> q is edge in WA} From CRA(min,+c) to WA: State of WA = (state q of CRA, register x) min simulated by nondeterminism To simulate p (a, x:=min(y,z)) -> q in CRA, add a-labeled edges from (p,y) and (p,z) to (q,x) Distributivity of + over min critical

  39. CRA(min,+c) > Min-Plus Regular Functions Input w: w1 # w2# # wn Each wi in {a,b}* ai= Number of a s in wi bi= Number of b s in wi a/1 b/1 # a,# b,# Cost(w) = minj { a1+ +aj+bj+1+ +bn} Thm: The class of regular functions w.r.t. Min-Inc semiring is a strict subset of weighted automata Above function is not regular: cost term is quadratic in input

  40. Machine Model for Semiring Regular Functions Updates to registers must be copyless Each register appears at most once in a right-hand-side Update [x,y] := [min(x,y),y] not allowed Necessary to maintain linear growth Need ability to simulate substitution Register x carries two values c and d Stands for the parameterized expression min(c, ?)+d Besides min and inc, can substitute ? with a value Resulting model coincides with regular functions over semiring Open: Decidability of equivalence over (N, min , +c)

  41. Talk Outline Conclusions + Open Problems

  42. Discounted Cost Regular Functions Basic element: (cost c, discount d) Discounted sum: (c1,d1)*(c2,d2) = (c1+d1c2, d1d2) Example of non-commutative monoid Classical Model: Future discounting Cost of a path: (c1,d1) * (c2,d2) * * (cn,dn) Polynomial-time algorithm for generalized shortest path Past discounting Cost of a path: (cn,dn) * (cn-1,dn-1) * * (c1,d1) Same PTIME algorithm works for shortest paths Prioritized double discounting Cost = (c1,d1) * * (cn, dn) * (c 1,d 1) * * (c n,d n) Shortest path: NExpTime algorithm Open: Shortest path for Discounted Cost Register Automata

  43. Open Problems and Challenges Complexity of equivalence of SSTs and STTs Large gap between lower and upper bounds Machine-independent characterization of regularity Support functions needed to compute a function Decidability of min-cost for discounted cost automata Decidability of equivalence for Copyless CRAs over (N,min,+c) Simpler/cleaner proofs of equivalence of machine models and MSO-definable transformations

  44. Unexplored Directions Probabilistic models Markov chains / MDPs with regular rewards Regular costs for infinite executions Infinitary operators: Lim-average, Discounted-sum Starting point: Infinite-String-to-Tree Transducers Regular costs for trees Combinations of other operations Regular functions over G(+,min): t := c | (t+t) | min(t,t)

  45. Conclusions Cost Register Automata Write-only machines with multiple registers to store outputs Regular Functions Definition parameterized by allowed operations Based on MSO-definable graph transformations / transducers Emerging theory Some results, new connections Many open problems and unexplored directions

  46. Acknowledgements and References Streaming String Transducers (with P. Cerny; POPL 11, FSTTCS 10) Transducers over Infinite Strings (with E. Filiot, A. Trivedi; LICS 12) Streaming Tree Transducers (with L. D Antoni; ICALP 12) Regular Functions and Cost Register Automata (with L. D Antoni, J. Deshmukh, M. Raghothaman, Y. Yuan; LICS 13) Decision problems for Additive Cost Regular Functions (with M. Raghothaman; ICALP 13) Infinite-String to Infinite-Term Regular Transformations (with A. Durand, A. Trivedi; LICS 13) Min-cost problems for Discounted Sum Regular Functions (with S. Kannan, K. Tian, Y. Yuan; LATA 13)

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#